iPhone 17 Pro Demonstrated Running a 400B LLM

https://twitter.com/anemll/status/2035901335984611412
68•anemll•1h ago

Comments

ashwinnair99•1h ago
A year ago this would have been considered impossible. The hardware is moving faster than anyone's software assumptions.
cogman10•1h ago
This isn't a hardware feat, this is a software triumph.

They didn't make special purpose hardware to run a model. They crafted a large model so that it could run on consumer hardware (a phone).

pdpi•56m ago
It's both.

We haven't had phones running laptop-grade CPUs/GPUs for that long, and that is a very real hardware feat. Likewise, nobody would've said running a 400b LLM on a low-end laptop was feasible, and that is very much a software triumph.

smallerize•37m ago
The iPhone 17 Pro launched 8 months ago with 50% more RAM and about double the inference performance of the previous iPhone Pro (also 10x prompt processing speed).
mannyv•22m ago
The software has real software engineers working on it instead of researchers.

Remember when people were arguing about whether to use mmap? What a ridiculous argument.

At some point someone will figure out how to tile the weights and the memory requirements will drop again.
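To make the mmap point concrete, here is a minimal sketch of reading one slice ("tile") of a weight file through a memory map instead of loading the whole file. The file layout (a flat array of little-endian float32 values) and the helper names `write_weights` and `read_tile` are assumptions for illustration, not anything taken from the project discussed in the thread.

```python
import mmap
import os
import struct
import tempfile

FLOAT_SIZE = 4  # bytes per float32 weight (assumed storage format)


def write_weights(path, values):
    """Write a flat list of float32 weights to disk (stand-in for a model file)."""
    with open(path, "wb") as f:
        f.write(struct.pack(f"<{len(values)}f", *values))


def read_tile(path, start, count):
    """Read `count` weights starting at index `start` without reading the whole file."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Only the pages backing this slice need to be faulted in by the OS.
            return list(struct.unpack_from(f"<{count}f", mm, start * FLOAT_SIZE))


def demo():
    values = [float(i) for i in range(1024)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.bin")
        write_weights(path, values)
        return read_tile(path, start=512, count=4)
```

Calling `demo()` reads four weights from the middle of the file. The same pattern applies to a much larger file, where the operating system pages in only the regions that are actually touched.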

snovv_crash•11m ago
The real improvement will be when the software engineers get into the training loop. Then we can have MoE that use cache-friendly expert utilisation and maybe even learned prefetching for what the next experts will be.
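A toy sketch of why cache-friendly expert utilisation matters: an LRU cache holds a few experts in fast memory, and every miss stands in for a read from slower storage. The class, the cache capacity, and both routing traces are hypothetical and only illustrate the idea; they do not model any real router.

```python
from collections import OrderedDict


class ExpertCache:
    """LRU cache of expert ids held in fast memory; a miss stands in for an SSD read."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.resident = OrderedDict()
        self.hits = 0
        self.misses = 0

    def fetch(self, expert_id):
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)  # mark as most recently used
            self.hits += 1
            return
        self.misses += 1
        self.resident[expert_id] = True
        if len(self.resident) > self.capacity:
            self.resident.popitem(last=False)  # evict the least recently used expert


def hit_rate(routing, capacity):
    """Fraction of expert lookups served from the cache for a routing trace."""
    cache = ExpertCache(capacity)
    for expert_id in routing:
        cache.fetch(expert_id)
    return cache.hits / len(routing)


# Hypothetical routing traces: one expert id per token, 8 experts in total.
scattered = [t % 8 for t in range(64)]         # cycles through all 8 experts
clustered = [(t // 8) % 8 for t in range(64)]  # reuses each expert 8 times in a row
```

With room for 4 experts, `hit_rate(scattered, 4)` is 0.0 because every lookup evicts an expert that is needed again before it can be reused, while `hit_rate(clustered, 4)` is 0.875. A router trained to prefer the second pattern would need far fewer storage reads for the same number of tokens.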
simopa•1h ago
It's crazy to see a 400B model running on an iPhone. But moving forward, as the information density and architectural efficiency of smaller models continue to increase, getting high-quality, real-time inference on mobile is going to become trivial.
firstbabylonian•1h ago
> SSD streaming to GPU

Is this solution based on what Apple describes in their 2023 paper 'LLM in a flash' [1]?

1: https://arxiv.org/abs/2312.11514

simonw•57m ago
Yes. I collected some details here: https://simonwillison.net/2026/Mar/18/llm-in-a-flash/
zozbot234•35m ago
A similar approach was recently featured here: https://news.ycombinator.com/item?id=47476422. The iPhone Pro has very limited RAM (12GB total), though, and you still need that for the active part of the model. (Unless you want to use Intel Optane wearout-resistant storage, but that was power-hungry and thus unsuitable for a mobile device.)
simonw•25m ago
Yeah, this new post is a continuation of that work.
cj00•59m ago
It’s 400B, but it’s a mixture of experts, so how many parameters are active at any time?
simonw•58m ago
Looks like it's Qwen3.5-397B-A17B so 17B active. https://github.com/Anemll/flash-moe/tree/iOS-App
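A rough back-of-the-envelope sketch of what those numbers imply for memory. The parameter counts come from the model name and the 12GB RAM figure is the one quoted elsewhere in this thread; the bits-per-weight value is an assumption, since the thread does not state which quantization is used, and the estimate ignores KV cache and runtime overhead.

```python
TOTAL_PARAMS = 397e9   # from the model name Qwen3.5-397B-A17B
ACTIVE_PARAMS = 17e9   # "A17B": parameters active per token
PHONE_RAM_GB = 12      # iPhone Pro RAM figure quoted in the thread


def weights_gb(params, bits_per_weight):
    """Approximate weight storage in GB (1 GB = 1e9 bytes), ignoring overhead."""
    return params * bits_per_weight / 8 / 1e9


def summarize(bits_per_weight):
    """Active fraction and weight sizes for an assumed quantization level."""
    return {
        "active_fraction": ACTIVE_PARAMS / TOTAL_PARAMS,
        "total_gb": weights_gb(TOTAL_PARAMS, bits_per_weight),
        "active_gb": weights_gb(ACTIVE_PARAMS, bits_per_weight),
    }
```

Under an assumed 4 bits per weight, `summarize(4)` gives roughly 198.5 GB for the full model and 8.5 GB for the active parameters, with about 4.3% of the weights used per token. The full model is far larger than `PHONE_RAM_GB`, which is why streaming from storage comes up at all. Note that the active set can change from token to token, so fitting 8.5 GB in RAM does not by itself avoid storage reads.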
rwaksmunski•48m ago
Apple might just win the AI race without even running in it. It's all about the distribution.
raw_anon_1111•40m ago
Apple is already one of the winners of the AI race. It’s making much more profit on AI (i.e., it isn’t losing money) off of ChatGPT, Claude, and Grok subscriptions through the App Store (you would be surprised at how many incels pay to make AI-generated porn videos).

It’s only paying Google $1 billion a year for access to Gemini for Siri.

detourdog•34m ago
Apple’s entire yearly capex is a fraction of the AI spend of the presumed AI winners.
devmor•16m ago
Which is mostly insane amounts of debt leveraged entirely on the moonshot that they will find a way to turn a profit on it within the next couple years.

Apple’s bet is intelligent; the “presumed winners” are staking our economic stability on a miracle, like a shaking gambling addict at a horse race who just withdrew his rent money.

qingcharles•16m ago
Plus all those pricey 512GB Mac Studios they are selling to YouTubers.
dzikimarian•16m ago
Because someone managed to run an LLM on an iPhone at an unusable speed, Apple won the AI race? Yeah, sure.
naikrovek•11m ago
whoa, save some disbelief for later, don't show it all at once.
causal•38m ago
Run an incredible 400B parameters on a handheld device.

0.6 t/s, wait 30 seconds to see what these billions of calculations get us:

"That is a profound observation, and you are absolutely right ..."

WarmWash•16m ago
I don't think we are ever going to win this. The general population loves being glazed way too much.
baal80spam•11m ago
> The general population loves being glazed way too much.

This is 100% correct!

intrasight•12m ago
Better than waiting 7.5 million years to have a computer tell you the answer is 42.
pier25•30m ago
https://xcancel.com/anemll/status/2035901335984611412
_air•20m ago
This is awesome! How far away are we from a model of this capability level running at 100 t/s? It's unclear to me if we'll see it from miniaturization first or from hardware gains
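For a sense of scale, here is a small arithmetic sketch using only the two rates mentioned in this thread: the 0.6 t/s observed in the demo and the 100 t/s target asked about above. The function names are illustrative.

```python
OBSERVED_TPS = 0.6   # tokens per second quoted in the thread
TARGET_TPS = 100.0   # target rate asked about in the thread


def tokens_generated(seconds, tps):
    """Tokens produced in a given time at a steady generation rate."""
    return seconds * tps


def seconds_for(tokens, tps):
    """Time needed to produce a given number of tokens at a steady rate."""
    return tokens / tps


def speedup_needed(current_tps, target_tps):
    """Factor by which generation speed must improve to reach the target."""
    return target_tps / current_tps
```

At the observed rate, 30 seconds yields about 18 tokens, and reaching the target would take a speedup of roughly 167x from some combination of smaller models, better software, and faster hardware.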
