The Siri+LLM features of Apple Intelligence haven't launched yet, and the other features, like notification summaries, run on-device.
It somehow looks worse than most scammy image generation apps you see on half-page search ads on the App Store. I have no idea how Apple willingly released it like that.
It was updated on my iPhone to a bland, forgettable abstract icon that’s still fairly mediocre but no longer an ongoing embarrassment for their corporate brand standards.
Just a total failure of execution.
That's if it even works; it fails with "something went wrong" for me 3 out of 5 times.
Siri being better at free-form requests for actions and doing internet/knowledge searches is about all I can think of. But then, I use Kagi for that, and unless Siri gets a pluggable search backend, I'm not sure being forced into only Apple's search, if it ever exists, is a great design.
https://www.macrumors.com/2026/01/30/apple-explains-how-gemi...
Samsung desperately wants to be this, but misses the part where iPhones don't come with third-party junkware even on entry-level models, and don't allow carrier junkware either. Google could be it, but they're too married to midrange hardware and underwhelming physical designs.
All it would take is for a manufacturer to commit to building their whole lineup with reasonably capable hardware (no ancient or weak SoCs like those in budget Android devices), to completely jettison third-party junkware, and to ship top-end flagships whose hardware actually matches that description, but none has managed this so far.
Probably not, but a zero junkware/zero carrier meddling policy is a major contributor to the brand's premium image, which makes the whole lineup more desirable. The iPhone is an invariable, singular product no matter how it's obtained, even if it has different price points.
By contrast, Samsung and the rest undermine themselves by trying to squeeze out pennies wherever they can. That's the behavior of a commodity, not a premium brand.
This is not a huge disadvantage, in my opinion. Let the rest of big tech fight each other to the death over cloud, while Apple controls a very profitable, differentiated offering (devices + services). Apple keeps the M-series hardware out of data centers, even though it posts some very attractive performance/W and per-core numbers.
What I really want is my phone to transcribe all of my phone calls to a Notes document. Since it isn't recording an audio conversation, I don't think the consent laws come into play.
A good analogy is streaming. To get good quality, sure, you can store the video file, but it's going to take up space. Videos are 2-4GB (let's say), and streaming will always be easier and better.
For models, we're looking at hundreds of GB of parameters. There's no way we can shrink that to, say, 1GB without a loss in quality.
So nope: beyond minimal classification and the like, on-device isn't happening.
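For scale, here's a back-of-envelope calculation of weight storage at different precisions (assuming dense weights dominate the footprint; KV cache and runtime overhead ignored, and the 70B parameter count is just an illustrative example):

```python
# Back-of-envelope: memory needed to hold dense model weights
# at various numeric precisions. Ignores KV cache, activations,
# and runtime overhead, which only add to the total.

def weight_bytes(n_params: float, bits_per_param: float) -> float:
    """Approximate weight storage in gigabytes (decimal GB)."""
    return n_params * bits_per_param / 8 / 1e9

n = 70e9  # a hypothetical 70B-parameter model
for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: {weight_bytes(n, bits):6.1f} GB")
# 16-bit: 140 GB, 8-bit: 70 GB, 4-bit: 35 GB, 2-bit: 17.5 GB
```

Even an aggressive 2-bit quantization of a 70B model is still well over an order of magnitude away from a 1GB budget, which is the point being made above.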
--
EDIT:
> Nobody wants to be sending EVERY request to someone else's cloud server.
We do this already with streaming. You watch YouTube, which hosts the videos in the "cloud". For the latest MKBHD video, I don't care about having it locally (for the most part). I just want to watch the video and be done with it.
Same with LLMs. If LLMs are here to stay, most people will want to use the latest and greatest models.
---
EDIT-EDIT:
If your response is that Apple will figure it out somehow: nope. Apple is sitting out the AI race, so it has no technology of its own. It has nothing beyond whatever is open source or whatever it can license from the rest. Apple isn't pushing the limits; they are watching the world move past them.
I'm pretty sure in five years, local LLM will be a thing.
Unless we invent a completely NEW way of doing video, there's no way to get that kind of efficiency. If tomorrow we're using quantum pixels (or something), sure, 500MB is good enough, but not with existing techniques.
In other words, you cannot compress a 100GB gguf file into, say, 5GB.
100GB to 5GB would be 20x. Video has seen an improvement of that magnitude since the days of MPEG-1.
It's interesting to consider that improvements in video codecs have come from both research and massively increased computing power, basically trading space for computation. LLMs are mostly constrained by memory bandwidth, so if there was some equivalent technique to trade space for computation in LLM inference, that would be a nice win.
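The bandwidth constraint can be sketched with a rough ceiling: in dense autoregressive decoding, each generated token streams roughly the full weight set from memory once, so throughput is capped near bandwidth divided by weight size. The figures below are illustrative placeholders, not measurements of any real device:

```python
# Rough ceiling for memory-bandwidth-bound autoregressive decoding:
# each token streams (roughly) the whole dense weight set once, so
# tokens/sec is capped near bandwidth / weight size. Compute, cache
# hits, and batching all shift the real number.

def decode_ceiling(bandwidth_gb_s: float, weights_gb: float) -> float:
    """Upper bound on tokens/second for a dense model."""
    return bandwidth_gb_s / weights_gb

# Hypothetical figures for illustration only:
print(decode_ceiling(120.0, 4.0))   # phone-class SoC, 4 GB of weights -> 30.0
print(decode_ceiling(546.0, 35.0))  # workstation-class, 35 GB of weights -> 15.6
```

This is why "trade space for computation" would be so valuable here: shrinking the bytes touched per token raises the ceiling directly, even with no faster memory.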
That's not a matter of Moore's Law failing, but of short-term capacity constraints being hit. It's actually what you want if Moore's Law is to keep going. It's a blessing in disguise for the industry as a whole.
It could work for Deadmau5 because he's probably popular enough to be part of the model. How about "Hey, play that $regional_artist's cover of Deadmau5"? Now the model needs to know about $regional_artist, the concept of a "cover", and where those remixes might live (YouTube? SoundCloud? somewhere else).
All of a sudden, it breaks down. So it'll work for "turn off the porch lights", but not for "turn off the lights at the front of the house".
This is a paradox, right? Handset makers want less handset storage so they can sell users more of their proprietary cloud storage, while at the same time wanting them to use on-device AI more frequently.
It will be interesting to see which direction they go. Finding a phone with more than 256GB of storage in the last few years is not only expensive AF, it's become more of a rarity than commonplace. Backtracking on this model just to fit AI models on board would be a huge paradigm shift.
Useful LLM usage involves pushing a lot of private data into them. There's a pretty big difference sending up some metadata about your viewing of an MKBHD video, and asking an LLM to read a text message talking about your STD test results to decide whether it merits a priority notification. A lot of people will not be comfortable with sending the latter off to The Cloud.
And anyway, you already see models like Qwen 3.5 9B and 4B beating 30B and 80B parameter models, and the smaller ones can already run on phones today, especially with quantization.
Benchmarks: https://huggingface.co/Qwen/Qwen3.5-4B
Or pull out the phone and ask "Who's the person I met on X day ..".
Indeed.
But they said 5 years. That's certainly plausible for high-end mobile devices in Jan 2031.
I have high uncertainty on if distillation will get Opus 4.6-level performance into that RAM envelope, but something interesting on device even if not that specifically, is certainly within the realm of plausibility.
Not convinced Apple gets any bonus points in this scenario, though.
Then why does my M4 run models at tok/s rates that similarly priced GPUs can't match?
1/ No, you don't get Opus 4.6-level performance on devices with 12GB of RAM; 7B quantised models just don't get that good. Still quite good, mind you. I believe the biggest advance in mobile AI will be apps providing tools and the device providing a discovery service (see Android's AppFunctions, if it were ever documented well): output quality doesn't matter much on device, but really efficient, reliable tool calling is a game changer.
2/ Opus 4.6 is now Opus 4.6 + 5 years, with new capabilities that make people want to keep sending everything to someone else's cloud server instead of burning their battery life.
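The tool-calling point in 1/ can be sketched in a few lines: the on-device model only has to emit a structured call against functions that apps register, and the OS executes it. Everything below is a hypothetical illustration, not the actual AppFunctions API:

```python
# Sketch of on-device tool calling: the model picks a registered
# function and fills its arguments; the system runs the function.
# The registry, schema, and names here are all made up.
import json

TOOLS = {
    "set_light": lambda room, on: f"{room} lights {'on' if on else 'off'}",
    "play_track": lambda artist, title: f"playing {title} by {artist}",
}

def dispatch(model_output: str) -> str:
    """model_output is the JSON the small on-device model emits, e.g.
    {"tool": "set_light", "args": {"room": "porch", "on": false}}"""
    call = json.loads(model_output)
    return TOOLS[call["tool"]](**call["args"])

print(dispatch('{"tool": "set_light", "args": {"room": "porch", "on": false}}'))
# -> porch lights off
```

The model's output quality only has to be good enough to choose the right tool and fill a small schema, which is a much lower bar than free-form generation.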
So unless the iPhone 20 Pro Max has 100GB of unified memory, all of this is just a pipe dream. It won't even have 32GB of unified memory.
- that a bunch of users won't jump ship if Apple stagnates for 5 years
- that a product based on a model with Q1 2026 SoTA performance would be competitive with products using 2031's models.
- that just having access to good (by 2025/2026 standards) models is the big thing that Apple needs in order for Apple Intelligence to finally be useful.
On that last point, I think the OS/app-level features are almost more important than the model itself. If the model can't _do_ anything, it doesn't really matter how intelligent it is. If Apple sits on their laurels for 5 years, would their OS, built-in apps, and 3rd-party apps have all the hooks needed for a useful AI product?

And that is exactly why it won't happen (like that).