This has been the case for people who buy into hype and don’t actually use the products, but I’m pretty sure people who do are pretty disillusioned by all the claims. The only somewhat reliable method is to test the things for your own use case.
That said: I always expected the tradeoff of Spark to be accuracy vs. speed. That it’s still significantly faster at the same accuracy is wild. I never expected that.
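For anyone who wants to test this on their own use case, here is a minimal sketch of measuring speed and accuracy together. Everything in it is an assumption for illustration: `generate` is a hypothetical wrapper around whatever client you use, `fake_model` is a toy stand-in, accuracy is a naive exact match, and whitespace-split words are only a crude proxy for tokens.

```python
import time
from statistics import median
from typing import Callable, Dict, List, Tuple


def benchmark(
    generate: Callable[[str], str],
    tasks: List[Tuple[str, str]],
    runs: int = 3,
) -> Dict[str, float]:
    """Run each (prompt, expected) task `runs` times.

    Reports the exact-match accuracy and the median words-per-second rate.
    `generate` is hypothetical: wrap your own model client so it takes a
    prompt string and returns the model's text output.
    """
    if not tasks or runs < 1:
        raise ValueError("need at least one task and one run")

    rates: List[float] = []
    correct = 0
    total = 0
    for prompt, expected in tasks:
        for _ in range(runs):
            start = time.perf_counter()
            output = generate(prompt)
            # Guard against a zero interval on very coarse clocks.
            elapsed = max(time.perf_counter() - start, 1e-9)
            # Whitespace-split words stand in for real token counts.
            rates.append(len(output.split()) / elapsed)
            correct += int(output.strip() == expected)
            total += 1

    return {
        "samples": total,
        "accuracy": correct / total,
        "median_words_per_s": median(rates),
    }


def fake_model(prompt: str) -> str:
    """Toy model so the sketch runs without any API: upper-cases the prompt."""
    time.sleep(0.01)
    return prompt.upper()
```

Running the same task list against two models gives you a speed number and an accuracy number for each, which is the comparison that matters when the claim is "faster at the same accuracy". For real work you would swap the exact-match check for whatever "correct" means in your use case, such as passing tests.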
When it reaches the efficiency of the main 5.3 Codex at this token rate, this kind of article will seem silly in retrospect.
I don't agree with this premise. I think it is fair to say that Haiku is a faster model than Opus.
Why not just benchmark the models yourself?
Tiny little YouTube channels will spend weeks benchmarking every motherboard from every manufacturer to detect even the tiniest differences!
Car reviews will often test drive the cars and run their own dyno tests.
Etc…
AI reviews, meanwhile, are just copy-pasted from the marketing blurb.
Because their incentives are to churn out stupid articles fast to get more views, and to stay in the good graces of major AI companies and potential advertisers. That, and their integrity and passion for what they do are minimal, plus they're paid peanuts.
Doesn't help that most brain-rotted readers are hardly calling them out for it, if they even notice it.
Ideally journalists / their employers would swallow that as the cost of business, but it's a hard sell if they are feeling the squeeze and aren't making much in the first place.
There are two ways to run this, and I’m curious which is better (time or quality, either would be interesting): you could run 5.3xxhigh as the coordinator, spinning up some eager-beaver coders that need wrangling, or you could run Spark as the coordinator and probably the code drafter, farming work out to the big brains where it runs into trouble.
Now that I think about it, corporations use both models as well. It would be nice for the user if the fast coordinator worked well; that lowers the number of turns and ultimately could let you stay in the zone while pairing with a coding agent. But I really don’t know which is better.
It looks like it only appears in the snippet the Google result shows, presumably taken from the meta tags. It's possible an earlier draft claimed a 15x speed boost and they forgot to remove the claim from the tags.