And the Chinese have been a huge source of innovation in the field.
They have been a source of innovation but probably not in training them.
It was much easier when companies had models on the /completion style APIs, because you could actually get the logits for each generation step, and use that as a dataset to fit your model to.
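To make the logits point concrete, here's a minimal sketch of what fitting a student to a teacher's per-step logits looks like. It assumes you already have both logit vectors for one generation step; the function names and the temperature value are mine for illustration, not anything a particular lab or API exposes.

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax, shifted by the max for numerical stability.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over the vocabulary for a single generation step.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical 4-token vocabulary, one step.
teacher = [2.0, 1.0, 0.1, -1.0]
loss_same = distillation_loss(teacher, teacher)
loss_diff = distillation_loss(teacher, [0.0, 0.0, 0.0, 0.0])
```

With only sampled text instead of logits, you get one token per step rather than the whole distribution, which is why it takes so many more API calls.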
That isn't to diminish the efforts of the Chinese developers though, they are great.
My intuition is that one needs A LOT of API credits to distill such large models.
And I guess the idea is that there is this extreme inflection point in utility somewhere that makes it so getting there first gives you some incredible economic edge.
It might not exist though. Like either utility plateaus and it's bubble-crash-of-the-century time, or it just keeps going up but without any specific point where you can differentiate something.
What yes, it's clear by now it's way beyond the capacity of those AIs, and the odds are pretty good it's impossible to a large extent (but some limited version of it may be possible).
We "distilled" modern cars from Model-T. You still driving the car that was "first" off an assembly line?
This is normal improvement to manufacture of stuff. Your handwavy "it was first so it's winner-winner chicken dinner!" is little more than your personal form of expression.
Since LLMs are a distill of web content, by your poorly defined metric, you must not value LLMs and AI companies? The content already existed! They're just a new indexing tool.
Each wizard school also seems to take a different approach and have different goals. Soon people will benchmark lawyers with Lego.
The machines these models run on are well known. They're not black boxes. The results will be same-y even though the timelines and processes companies took to get there were different.
UPS trucks may carry different sizes and shapes of packages day to day but their upper bounds on total weight and geometry exist too.
A Honda and a Ford can look different, but physical reality, whether the measure is user feedback (human biology is physical too) or physics itself, still results in very same-y 4 wheels, etc etc.
What's strange to me is all the software engineers who ignore physics. All of our applied knowledge that gives rise to software engineering also constrains the outcomes. Our ability to sit down every day and arbitrarily slice up data in various ways is very much constrained by physics like everything else.
The easy money/endless hype era of ZIRP where SWEs failed up thanks to endless employment opportunities has resulted in way too many SWEs believing their efforts on some trivial shit like a JS framework, or some CSS designs is propelling humans into the future.
Nah, it's just physics as usual. Y'all's sensory memory is just parroting the crap it memorized.
Doesn't matter: if they're good enough and cheaper, they'll sink the US model-makers eventually. The free market demands it.
The US invented solar panels, and led in solar panel tech for a long time. Who leads in solar panel tech now?
China has a playbook for de-industrializing its capitalist rivals. If we leave MBAs and free-marketers in power, China will "come to dominate all technologies, including A.I., and ... America [will] export little more than soybeans and corn" (https://www.nytimes.com/2025/12/17/opinion/trump-ai-chips-nv...).
This kinda sounds like you're talking about Trump, but I think the problem predates him and is far deeper. If anything, Trump is a spastic reaction to the deeper problem. He won because his rhetoric gestured in the direction of fixing the problem, but he's too incompetent to pull it off (and the bulk of the competent people don't want to fix the problem for ideological reasons).
This is how you go from stability to world wars. A couple of rich guys got together and decided they were going to redraw all of the maps and toss the rulebook overboard, and it is for the most part going their way. People are being executed and the useful idiots are falling over each other to defend it.
If you had told me in 1999 that this would happen by 2026 I would have happily declared you mad, but here we are.
It's way deeper than that, though. It's stuff like US businessmen choosing to literally move the US's rare-earth magnet production capacity to China, teaching China how to make them in the process (https://www.nytimes.com/2025/12/31/business/china-rare-earth...). It's the US knowing about its rare-earth vulnerability for a decade or more but being completely unable to do anything about it. It's the US losing other strategic capabilities like large-scale electronics manufacturing capacity, and people being totally fine with that because "cheaper iPhones, more margins, good deal!"
But the singular focus on destruction of what is a cornerstone of the stability of the Western hemisphere is absolutely unprecedented. And to see so many people falling for it, hook, line and sinker. They are so blasted with crazy things that they no longer see anything strange at all about each and every day's happenings, and they even jump to defend the absolutely indefensible. Something they probably would have been - rightly - horrified by less than a decade ago is now perfectly normal.
> But the singular focus on destruction of what is a cornerstone of the stability of the Western hemisphere is absolutely unprecedented. And to see so many people falling for it, hook line and sinker.
IMHO, that short-term thinking over such a long span laid the groundwork for that destruction.
I read this the other day, and I think it's an interesting take that rings true:
https://www.nytimes.com/2026/01/06/opinion/trump-presidentia...:
> Instead of comparing what is happening under Trump with the situations in Hungary, Turkey and Russia, Goldstone argued that conditions in the United States are,
>> ironically, more like what happened in Venezuela, where after a century of reasonably prosperous democratic government, decades of elite self-serving neglect of popular welfare led to the election of Hugo Chávez with a mandate to get rid of the old elites and create a populist dictatorship.
>> I find that decades-long trends in the U.S. — stagnating wages for non-college-educated males, sharply declining social mobility, fierce political polarization among the elites and a government sinking deeper and deeper into debt — are earmarks of countries heading into revolutionary upheaval.
>> Just as the French monarchy, despite being the richest and archetypal monarchy, collapsed in the late 18th century because of popular immiseration, elite conflicts and state debts, so the U.S. today, despite being the richest and archetypal democratic republic, is seeing its institutions come under attack today for a similar set of conditions.
No, because it's not a problem of economic development, but political ideology.
China's political priority is technological dominance and capability, and it views free markets as a tool subordinate to those goals. The US's political priority is financial wealth, and an extreme ideological attachment to free markets that overrides most other priorities. The US has an ideological vulnerability that China is well-positioned to exploit.
This problem goes well beyond Trump, and has roots that are very deep.
Lest you forget: China is controlled by the CCP and is not a democracy. It will not affect political priorities if "more Chinese become middle class or wealthy" and "view things differently." The Chinese political system does not answer to them, and will only throw them a bone if there's a major threat to stability.
You're echoing the 90s-era hope that free markets would bring political liberalization to China, but history has debunked that idea.
What do you mean, exactly?
China isn't the USSR: they're not wedded to central planning. They've figured out how to use capitalism while keeping it squarely under their control. Arguably, they're playing the "capitalism" game more successfully than "capitalist" countries like the US.
When you compare the US to China, it's the US that looks sclerotic, like the USSR once did.
One of America's big weaknesses is a common lazy assumption that we'll always be at the top, so we don't respond to challenges until it's too late. Then we tell ourselves some reassuring story and comfort ourselves by gazing at one of the few remaining industries where we're still ahead. I'm pretty sure if the US and China got into a conflict, the US would get its ass kicked like the Nazis and Japanese did during WWII, and for similar reasons.
Things can change.
Yeah, but it's more likely the US will collapse like the USSR than it is for China to collapse. The big reason the USSR collapsed was its economic output couldn't keep up, and it couldn't afford the competition anymore.
China's mostly caught up technologically to the US. It's ahead or pulling ahead in many areas. Its production capacity is way ahead. Without Chinese production propping up the US, US stores would probably feel a lot like late-Soviet stores, with bare shelves and not enough products to satisfy demand.
Certainly not ruling that out.
> Without Chinese production propping up the US, US stores would probably feel a lot like late-Soviet stores
I don't agree here. Without Chinese production, we'd simply still be producing stuff here.
No, the US can't anymore. The supply chains have moved to China now, and needed capital equipment and know-how has mostly been lost in the US. It would take a massive investment to get back to "simply ... producing stuff here."
And if anyone tries, the chorus of "muh iPhone expensive!" would be deafening, the politicians would retreat and go back to bickering about the culture war and plotting their next attack ad, and the businessmen would go back to counting their money.
For the size/performance yes.
> In any case, they wouldn't exist if not for superior models they were distilled from.
So? Those models wouldn't exist without the sum total of human knowledge. As long as a work is transformative why does it matter?
Measured by the DCI the Chinese AI models are about 1.5 years ahead of US models.
DCI = Dust42 Capability Index: MBP Max 64GB, Qwen3-80B MLX 4bit quant, 40 tokens per second. It is not on Claude Opus level but very, very useful if you have no internet, i.e. on a flight. And occasionally it surpasses even Opus by a wide margin. Opus is a pain in the neck once the coding task at hand surpasses its capabilities. Qwen3 is much easier to guide step by step to a solution.
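For anyone wondering why an 80B model fits on a 64GB laptop at all, the back-of-envelope arithmetic is below. It's a rough sketch that counts only the weights: it ignores the KV cache, activations, and the per-group scale factors a real 4-bit quant carries, so actual usage will be somewhat higher. The helper name is made up for illustration.

```python
def weight_memory_gb(num_params, bits_per_weight):
    # bits -> bytes -> gigabytes (1 GB = 1e9 bytes); weights only.
    return num_params * bits_per_weight / 8 / 1e9

four_bit_gb = weight_memory_gb(80e9, 4)     # 4-bit quant
sixteen_bit_gb = weight_memory_gb(80e9, 16) # unquantized 16-bit weights
print(f"4-bit: {four_bit_gb:.0f} GB, 16-bit: {sixteen_bit_gb:.0f} GB")
```

So roughly 40 GB of weights at 4 bits versus 160 GB at 16 bits, which is the difference between fitting in 64GB of unified memory and not.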
My theory is that these models serve the purpose of being relatively easy to run/tweak for researchers, and mainly serve to demonstrate the effectiveness of new techniques in training and inference, as well as the strength of AI labs that created them.
They are not designed to be state of the art commercial models.
By choosing bigger model sizes, running more training epochs, and drilling the models a bit more on benchmarking questions, I'm sure the Chinese could close the gap, but that would delay these models, make them more expensive and harder to run without showing any tangible research benefit.
Also my 2c: I was perfectly happy with Sonnet 3.7 as of a year ago, if the Chinese have a model really as good as that (not only one that benchmarks as well), I'd definitely like to try it.
GLM-4.7 is like a mix of Sonnet 4.5 and GPT-5 (the first version, not the later ones). It has deep, deep knowledge, but it's often just not as good in execution.
They're very cheap to try out, so you should see how your mileage varies.
Of course, for the hardest possible tasks that only GPT 5.2 approaches, they're not up to scratch. And for the hard-ish tasks that Opus 4.5 tackles, in C++ for example, Minimax feels closer, but it just doesn't "grok" the problem space well enough.
Adding "behind" after "lag" as a verb is more of a "because it sounds good" choice, perhaps as a subconscious way to emphasize the verb, but it isn't a grammatical requirement at all.
Leaving it off is almost certainly more to keep the headline short than anything else.
Note also that these aren't really questions of grammar (syntax) but meaning (semantics). Does "lagged" mean the same thing as "trailed" in this kind of construction? It didn't some decades ago, but maybe it does today. Or will tomorrow.
For me, there are three idiomatic forms:
1. Using "lag behind" gives a target/reference as a prepositional relationship, not as an object of the verb "to lag".
2. Using "caused to lag" allows one to specify a causal agent, but again not as an object of the verb "to lag".
3. Using "lag" alone is a subject-verb construct, leaving an implicit target/reference from context expectations. A coach or supervisor might scold someone for lagging.
As a bit of a tangent, I actually wonder if the etymology of "to lag" is more Germanic than some people assume. The verb lagern has many uses for placing, storing, and leaving behind. It's where our English concept of a "lager" beer comes from too, referencing the way the beer is fermented in (cold) storage. If this linguistic connection remained fresh, we might think of an SVO construct of lagging as the opposite of the intent in this article. The leader would lag the follower by leaving them behind!
It's thieves all the way down.
All frontier US models are closed weight. It's great what the Chinese are doing, because open weights help everyone. There is also a lot of research happening thanks to these open weights: look how much research is being done using Qwen models in the US (Microsoft etc.) and in the rest of the world.
Have you seen the Manifold Constrained Hyper Connections (mHC) paper from DeepSeek a few days ago? It projects the residual connection space onto a constrained manifold to keep identity mapping properties while enabling richer internal connectivity, so basically it eliminates a huge problem.
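If the projection idea sounds abstract, here's a toy sketch of one way such a constraint can work. Big caveat: I'm assuming, purely for illustration, that the manifold is the set of doubly stochastic matrices (every row and column sums to 1) and that the projection is alternating row/column normalization (Sinkhorn-Knopp). Nothing above confirms that's what the paper does, and the function name is hypothetical.

```python
def sinkhorn_project(matrix, iterations=50):
    # Alternately normalize rows and columns of a square, positive matrix
    # so it approaches a doubly stochastic one. Illustrative assumption only.
    m = [row[:] for row in matrix]
    n = len(m)
    for _ in range(iterations):
        for row in m:
            row_sum = sum(row)
            for j in range(n):
                row[j] /= row_sum
        for j in range(n):
            col_sum = sum(m[i][j] for i in range(n))
            for i in range(n):
                m[i][j] /= col_sum
    return m

# A hypothetical 2x2 mixing matrix between two residual streams.
mixed = sinkhorn_project([[1.0, 2.0], [3.0, 4.0]])
# The identity matrix already satisfies the constraint, so it is left unchanged.
identity = sinkhorn_project([[1.0, 0.0], [0.0, 1.0]])
```

The intuition under this assumption: the plain identity (ordinary residual) mapping is a member of the constrained set, and a product of doubly stochastic matrices is still doubly stochastic, so stacking many layers can't scale the signal up or down the way unconstrained mixing could.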
They also released A LOT of training tricks and innovation around optimizing inference and training.
As to other industries:
"China leads research in 90% of crucial technologies — a dramatic shift this century" [1]
And here's[2] "China Is Rapidly Becoming a Leading Innovator in Advanced Industries", a big report on where they lead and how.
1. https://www.nature.com/articles/d41586-025-04048-7
2. https://itif.org/publications/2024/09/16/china-is-rapidly-be...
anishgupta•19h ago
meisel•18h ago
anishgupta•16h ago
They use H800s, as opposed to the major US ones on H100s (2-3x faster).
eptcyka•18h ago
jacquesm•17h ago
What doesn't kill you really does make you stronger.
ilamont•17h ago
For American and other non-PRC companies thinking of using Chinese models, doesn't this have to be balanced with the risk that the US or its leadership may kneecap the Chinese models through export controls, an executive order, or some other means?
1970-01-01•16h ago