https://huggingface.co/collections/Qwen/qwen3-67dd247413f0e2...
And the MoE 30B one has a decent shot at running OK without a GPU. I'm on a 5800X3D, so two generations old, and it's still very usable.
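If you want to try that CPU-only, here's a minimal sketch using the llama-cpp-python bindings against a local GGUF quant (the file name, context size, and thread count below are placeholders, not a recommendation):

    # CPU-only inference against a quantized Qwen3-30B-A3B GGUF.
    # The model path is illustrative; grab whatever quant fits your RAM.
    from llama_cpp import Llama

    llm = Llama(
        model_path="Qwen3-30B-A3B-Q4_K_M.gguf",
        n_ctx=8192,        # context window
        n_threads=8,       # match your physical cores
        n_gpu_layers=0,    # force everything onto the CPU
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Summarize mixture-of-experts in one sentence."}],
        max_tokens=128,
    )
    print(out["choices"][0]["message"]["content"])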
Qwen3-235B-A22B has 118 `.safetensors` files at 4GB each.
There are a bunch of models and quants between those.
I agree, the advantage of Qwen3's family is the plethora of sizes and architectures to choose from. Another is ease of fine-tuning for downstream tasks.
On the other hand, I'd say it's "in spite" of their benchmarks, because there's obviously something wrong with either the published results, or the way they measure them, or something. Early impressions do not support those benchmarks at all. At one point they even had a 4b model be better than their prev gen 72b model, which was pretty solid on its own. Take benchmarks with a huge boulder of salt.
Something is messing with recent benchmarks, and I don't know exactly what, but I have a feeling that distilling + RL + something in their pipelines is making benchmark data creep into the models, either by reward hacking or by other signals getting leaked (i.e. prev-gen models optimised for one benchmark are "distilling" those signals into newer, smaller models). No, a 4b model is absolutely not gonna be better than 4o/Sonnet 3.7, whatever the benchmarks say.
Gobbling up rising brands kept their finances going for a while, but the grand Metaverse pivot was clearly their (much struggling) attempt to invent their own titanic platform akin to Android or iPhone.
With that not gaining as much traction as they wanted as quickly as they wanted, they're still on the hunt, as here.
Like, literally building smart homes.
Locally intelligent in ways that enable truly magical smart home experiences while preserving privacy and building trust.
But connected in ways that facilitate pseudo-social interactions, entertainment, and commerce.
Meta's biggest competitors are Apple and Amazon. This is the first clear opportunity they've had to leapfrog both.
I'm earnestly not sure which Meta is less qualified for: building physical homes, or building privacy & trust.
The outsized public hatred toward Meta is almost entirely driven by a bureaucratic, anti-technology Europe (that has finally realized that their overstepping is hurting their future) and a US political institution that needed someone to demonize to keep us all distracted.
There are very good reasons to dislike Meta and Meta products. But they're likely not the ones you're referring to.
The value in SE Asia is mostly B2C. Instead of the marketplace feature, most tiny local businesses (or even big ones willing to evade tax by not having any physical presence) will open a small business or general page and publish their wares as posts. Live streams are used to demo products or services now and then. People follow these pages and flock over to buy things.
In a sense, Facebook and WhatsApp are like the Amazon/AliExpress of SE Asia. I was there for 5 months visiting a friend (and recovering from burnout), and the number of people using such pages to sell everything from basic clothing to food to services is HUGE! It is literally a huge business hub for people to discover and make online purchases. In summary, Facebook pages are the e-commerce front (due to the lack of Shopify/Amazon-like operators who can handle logistics and payments) for individual businesses.
There were many journalist reports about this phenomenon several years back, but I am too sleepy and tired to link those.
That is the economic structure of their business model.
Now juice that model with $ billions of revenue and $ trillions in potential market cap for shareholders, who demand double digit percentage growth per year.
That defines the scale of available resources to drive the business model forward.
This is a machine designed to scale up and maximally leverage seemingly small conflicts of interest into a global monster that feeds on mental and social decay.
——
Of course, it benefits Facebook and customers to mix in as much genuine side products and services with real value as possible.
But that only wedges the destructive core into individual lives and society even more.
Now add AI algorithms to their core competencies of surveillance integration and psychological manipulation, as well as to the side-value honey features.
We are getting Stockholm’ed and stewed in a lot of high walled slow cookers these days.
Offline first locally hosted AI household assistants.
The problem, in my opinion, is that MZ/CC/AA-D are feeling that they have to release models of some flavor every month to stay competitive.
And when the rest of the company is planning to throw you an on-stage party to announce whatever next model, and the venue and guests are paid for, you're gonna have the show whether the content is good or not.
The Llama program right now is "we must go faster," but without a clear product direction or niche that they're trying to build towards. Very little is said no to. Just be the best at everything. And they started from behind; how can you think you're gonna catch up to a 1-2 year head start just with more people? The line they want to believe is "the best LLM, not just the best OSS LLM."
Because of the constant pressure to release something every month (nearly, but not a huge exaggeration), and the product direction coming from MZ himself, the team is not really great at anything. There is a huge apparatus of people working on it, yet half of it or more, I believe, is baggage required because of what Meta is.
I guess we'll see how long this can be maintained.
For my task, Llama 3.3 was still the best local model I could run. I tried newer ones (Phi4, Gemma3, Mistral Small) but they produced much worse results. Some larger local models are probably better if you have the hardware for them, but I only have a single 4090 GPU and 128 GB of system RAM.
Here's the training code that I used to fine-tune ModernBERT from the ~5000 pages I had labeled with Llama 3.3. It should be a good starting point if you have your own fine-tuning task like this. If you can get away with a smaller context than I used here, it will be much faster and the batches can be larger (requires experimentation).
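Roughly, the shape of such a fine-tune with the Hugging Face Trainer looks like this (the checkpoint name, column names, context length, and hyperparameters below are illustrative placeholders, not the exact ones from my script):

    # Sketch: ModernBERT as a sequence classifier on LLM-labeled pages.
    from datasets import load_dataset
    from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                              Trainer, TrainingArguments)

    model_id = "answerdotai/ModernBERT-base"          # assumed base checkpoint
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id, num_labels=2)

    # Expects CSVs with a "text" column (page content) and a "label" column
    # (the class the bigger LLM assigned).
    ds = load_dataset("csv", data_files={"train": "pages_train.csv",
                                         "eval": "pages_eval.csv"})

    def tokenize(batch):
        # The long context is what makes training slow; shrink max_length if you can.
        return tokenizer(batch["text"], truncation=True, max_length=4096)

    ds = ds.map(tokenize, batched=True)

    args = TrainingArguments(
        output_dir="modernbert-page-classifier",
        per_device_train_batch_size=4,    # larger batches fit with shorter context
        num_train_epochs=3,
        learning_rate=5e-5,
        eval_strategy="epoch",
    )

    Trainer(model=model, args=args, processing_class=tokenizer,
            train_dataset=ds["train"], eval_dataset=ds["eval"]).train()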
Don't take the "activity" of those places as gospel; try the models on your own stacks, with your own benchmarks, for best results.
But if you have an Android phone, Gemini on that phone is far superior. And if you have Apple, well, maybe that's all the people who use it.
I would have imagined such a thing would be smaller and thus run on smaller configurations.
But since I am only a layman maybe someone can tell me why this isn't the case?
It's just a fancy name for sparse evaluation of the total network to save compute and memory bandwidth.
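To make "sparse evaluation" concrete, here's a toy sketch of an MoE layer (the sizes and top-k below are made up for illustration): a router sends each token to a couple of experts, so only a fraction of the parameters are computed per token.

    import torch
    import torch.nn as nn

    class TinyMoE(nn.Module):
        def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
            super().__init__()
            self.router = nn.Linear(d_model, n_experts)
            self.experts = nn.ModuleList(
                nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
                for _ in range(n_experts)
            )
            self.k = k

        def forward(self, x):                       # x: (n_tokens, d_model)
            scores = self.router(x)                 # (n_tokens, n_experts)
            weights, idx = scores.topk(self.k, dim=-1)
            weights = weights.softmax(dim=-1)       # mixing weights over the chosen experts
            out = torch.zeros_like(x)
            for e, expert in enumerate(self.experts):
                token_ids, slot = (idx == e).nonzero(as_tuple=True)
                if token_ids.numel() == 0:
                    continue                        # this expert is skipped: compute saved
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
            return out

    x = torch.randn(10, 64)
    print(TinyMoE()(x).shape)   # torch.Size([10, 64])

Every expert's weights still have to sit in memory, though, which is why Qwen3-235B-A22B ships roughly 470GB of safetensors even if only ~22B parameters are active per token.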
Also, the software you're working on will generally in some way have a real-world domain - without knowing it, the AI will likely be a less effective assistant. Design conversations with it would likely be pretty non-fun, too.
Finally, the “bitter lesson” article[0] from a couple years ago is I think somewhat applicable too.
[0]: http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Meta does have some specialized models though; Llama Guard was released for Llama 2 and 3.
The expensive part is building the dataset; training itself isn't too expensive (you can even fine-tune small models on free Colab instances!), and once you have your dataset, you can just fine-tune the next generalist model as soon as it's released and you're good to go.
0. Introducing Llama API in preview
This one is good but not centre stage worthy. Other [closed] models have been offering this for a long time.
1. Fast inference with Llama API
How fast? And how much faster than others? This section talks about latency, yet there are absolutely no numbers in it!
2. New Llama Stack integrations
Speculation with zero new integrations. Llama Stack with NVIDIA had already been announced, and then this section ends with '...others on new integrations that will be announced soon. Alongside our partners, we envision Llama Stack as the industry standard for enterprises looking to seamlessly deploy production-grade turnkey AI solutions.'
3. New Llama Protections and security for the open source community
This one is not only the best on this page, it is actually good, with the announcement of Llama Guard 4, LlamaFirewall, and Llama Prompt Guard 2.
4. Meet the Llama Impact Grant recipients
Sorry, but neither the gross amount ($1.5 million USD) nor the average ($150K per recipient) is anything significant at Facebook scale.
It's ironic that China is acting as a better good faith participant in open source than Meta. I'm sure their stakeholders don't really care right now, but Meta should switch to Apache or MIT. The longer they wait the more invested people will be and the more intense the outrage when things go wrong.
The whole "licensing" stuff on language models is a scam, or more precisely, an attempt to create a new kind of IP law from thin air.
What's protected is the content of the movie, and it's protected because it derives from human creativity.
> The copyright law only protects “the fruits of intellectual labor” that “are founded in the creative powers of the mind.”
> […]
> Similarly, the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.
source: https://www.copyright.gov/comp3/chap300/ch300-copyrightable-...
Because that would be a derivative work.
>the content of the movie
Which exists as a binary blob. Copying that binary blob requires a license to do so.
No, derivative works require human creativity themselves. Compiling or re-encoding still doesn't count.
See: https://www.law.cornell.edu/uscode/text/17/101
A work consisting of editorial revisions, annotations, elaborations, or other modifications which, as a whole, represent an original work of authorship, is a "derivative work".
> Which exists as a binary blob.
Nope, for copyright protection it must exist as at least one binary blob, but having multiple binary blobs (with different resolutions) doesn't make each a different copyrighted work. It's the underlying creation that is protected, not a particular instance of it. Star Wars: The Empire Strikes Back is what's registered at the Copyright Office, not Star_Wars_The_Empire_Strikes_Back.720p.avi.
> Copying that binary blob requires a license to do so.
Fortunately no, otherwise your internet provider would need a license from the copyright holders to copy the blob from Netflix server to your machine.
One last time: copyright isn't about the blob, it's about the creation stored in it. The process of creating the blob doesn't grant you any copyright protection if you don't own the underlying material.
Then it would just be a copy. Copies need a license.
>Fortunately no, otherwise your internet provider would need a license from the copyright holders to copy the blob from Netflix server to your machine.
No, I believe this is because internet providers do not save the content, which means a copy is not considered to be made. If copies of binary blobs were allowed, people could legally make pirate sites sharing copies like that.
Nope, that's not the reason, and that's why you don't need to give a copyright license to Apple before storing your personal pictures to iCloud either, nor does Apple need a license to store copyrighted material you got a license for (like software or paid downloaded movies). Copying a blob isn't a license infringement in itself, because the blob itself was never protected by copyright.
> If copies were allowed of binary blobs people could legally make pirate sites sharing copies like that.
No, because sharing is what you'd get prosecuted for.
Part of me thinks you should really try to start learning the basics of stuff before arguing on the internet about it, but who am I to judge your life choices. I did my best to help you learn something, but if you refuse to, there's nothing more I can do.
You do, which is why it's part of the terms of service for iCloud.
https://www.apple.com/legal/internet-services/icloud/en/terc....
>No, because sharing is what you'd get prosecuted for.
Copyright controls both reproduction and distribution.
>start learning the basics of stuff before arguing on the internet about it
You are being unnecessarily smug and condescending.
404
> Copyright controls both reproduction and distribution.
Reproduction in the copyright sense isn't about blob copying. RAID 1 isn't a copyright infringement either… And neither is a Windows defragmentation (which is just the OS copying files around).
> You are being unnecessarily smug and condescending.
You are needlessly obstinate on a topic you don't understand.
The training data is all scraped from the internet, ebooks from libgen, papers from Sci-Hub, and suchlike.
They don't have the right to redistribute it.
I agree with you that their license is not open source, but model weights are not binary blobs! Please stop spreading this misconception.
It's all AI all the time now though; I haven't seen any mention of our reimagined future of floating heads hanging out together in quite some time.
I recently travelled and needed to work (coding and video editing in DaVinci) a lot in hotels and random places. I can't bring large screens everywhere (and I hate working with small fonts and screens), and the Quest 3 was a perfect fit here. Sometimes at home or the office (I have a private one), I just don't want to sit on my buttocks all the time, so I put on the VR goggles and can keep working in any position (lying on a sofa or even sunbathing outdoors).
As soon as new XR/MR glasses become lighter (there are some good ones already - Visor, Beyond BigScreen 2, etc), more and more people will discover how usable and optimized for work this tech is.
The trepidation behind VR for professional applications makes sense to me - it's expensive and tough to compare with what it's replacing. As a pure vehicle for fun though, I genuinely have no regrets with my Quest hardware. It was easily a better purchase than my Xbox One.
It looks goofy but gives you a massive battery, and the fan is nice as well.
(There are other critiques of AVP that might not make it the right choice, but desktop work experience isn’t one of them.)
But it's obvious that total weight does matter (e.g. a 1-ton weight would crush your spine); below some threshold, though, it's more about weight distribution and balance than total weight.
The first one can't be fixed without removing weight.
The second one can be fixed by adding more weight, since the cause of the problem is not the total weight but rather its distribution.
Do you put a keyboard on your stomach or something?
I would’ve hoped to have seen Meta, in their supposed dedication to open source, actually fix it.
1. Llama API Preview: Launched a limited preview of the Llama API, a developer platform simplifying Llama application development with easy API key creation, playgrounds, SDKs, and tools for fine-tuning and evaluation. It emphasizes model portability and privacy.
2. Fast Inference Collaborations: Announced collaborations with Cerebras and Groq to offer developers access to faster Llama model inference speeds via the Llama API.
3. Expanded Llama Stack Integrations: Revealed new and expanded Llama Stack integrations with partners like NVIDIA, IBM, Red Hat, and Dell Technologies to make deploying Llama applications easier for enterprises.
4. New Llama Protection Tools & Program: Released new open-source security tools including Llama Guard 4, LlamaFirewall, and Llama Prompt Guard 2, updated CyberSecEval 4, and announced the Llama Defenders Program for partners to help evaluate system security.
5. Llama Impact Grant Recipients: Announced the 10 international recipients of the second Llama Impact Grants, awarding over $1.5 million USD to support projects using Llama for transformative change.
Overall, the announcements emphasize making Llama more accessible, easier to build with, faster, more secure, and supporting its diverse open-source community.
For example, the entire first sentence could be collapsed to "Announcements." Some sentences are just pablum entirely.
It’s helpful to post summaries here, but they need to be curated.