
Reverse-engineering the RK3588 NPU: Hacking limits to run vision transformers

https://amohan.dev/blog/2025/shard-optimizing-vision-transformers-edge-npu/
51•rcarmo•1d ago

Comments

jauntywundrkind•1d ago
Epic hacker work!

For what it's worth, it seems like there's a bunch of open source NPU work in progress too. There's a layer "TEFLON" for Gallium3D shared by most of these drivers, that TensorFlow can use. Then hardware drivers for Rockchip (via ROCKET driver), and Vivante (with their Etnaviv drivers). It'd be extra interesting now to see how (or if?) they've dealt with the system constraints (small scratchpad size) here. https://www.phoronix.com/news/Gallium3D-Teflon-Merged https://www.phoronix.com/news/Rockchip-NPU-Linux-Mesa https://www.phoronix.com/news/Two-NPU-Accel-Drivers-2026

poad4242•7h ago
Thanks! I actually tracked the Teflon/ROCKET driver work closely during my initial research (it was the 'Plan B' in my original proposal if the vendor blobs failed entirely).

The main reason I stuck with the closed-source `rknn` stack for this specific project was operator support for Transformers. Teflon is getting great at standard CNN ops (Fused ReLU, Convs, etc.), but the SigLIP vision encoder relies on massive Transposes and unbounded GELU activations that currently fall off the 'happy path' in the open stack.

To your point on the system constraints (small scratchpad): I suspect the current open-source drivers would hit the exact same 32KB SRAM wall I found. The hardware simply refuses to tile large matrices automatically. My 'Nano-Tiling' fix was a software-level patch; porting that logic into the Mesa driver itself would probably be the 'Holy Grail' fix here.
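
For readers unfamiliar with the idea, here is a minimal sketch of software-side tiling under a fixed scratchpad budget. It is not the 'Nano-Tiling' patch from the post and does not use the `rknn` or Mesa APIs; it is a generic blocked matrix multiply in plain Python. The 32KB budget comes from the comment above, while the fp16 element size, the square-tile policy, and the names `pick_tile` and `tiled_matmul` are assumptions made for illustration.

```python
# Illustrative sketch only: blocked matmul sized to a scratchpad budget.
# SCRATCHPAD_BYTES mirrors the "32KB SRAM wall" from the thread;
# BYTES_PER_ELEM (fp16) and the square-tile policy are assumptions.

SCRATCHPAD_BYTES = 32 * 1024
BYTES_PER_ELEM = 2


def pick_tile(budget_bytes=SCRATCHPAD_BYTES, elem_bytes=BYTES_PER_ELEM):
    """Largest square tile edge t such that three t*t tiles (A, B, C) fit."""
    t = 1
    while 3 * (t + 1) * (t + 1) * elem_bytes <= budget_bytes:
        t += 1
    return t


def tiled_matmul(a, b, tile):
    """Compute a @ b tile by tile; a is m x k, b is k x n (lists of lists)."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0.0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            for p0 in range(0, k, tile):
                # Only one tile each of A, B and C is touched in this step,
                # so the working set stays within the chosen budget.
                for i in range(i0, min(i0 + tile, m)):
                    for j in range(j0, min(j0 + tile, n)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c
```

Under these assumptions `pick_tile()` yields a 73-element tile edge, since three 73x73 fp16 tiles take 31,974 bytes. Real hardware would add alignment and bank constraints on top of this, which the sketch ignores.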

Neywiny•1d ago
This is good work. I would say that there was very little reverse engineering, but that's fine. It's interesting seeing some companies look at ARM's Ethos line as holding them back and others see it as pulling them forward. I'm not sure if ARM is the best solution, but all these different NPUs feel a bit like the early CPU architecture and compiler days. Hopefully we can make it through unscathed so at least we get better error messages, or maybe even compilers that know those kinds of idiosyncrasies well enough to avoid such things.
kvuj•1d ago
Awesome! Finally putting back "Hacker" in "Hacker News".
doctorpangloss•1d ago
hacker news needs a reprieve from "Problem. The fix? Vibe coding session. Here's the ChatGPT report"
poad4242•7h ago
I understand the frustration with AI-written posts lately, but this was the opposite of that. It took months of hard work and many late nights. While the hardware manual (TRM) is public, it doesn't explain how to handle the strict 4KB memory bank limits. I had to figure out how to shard and tile the model because the hardware won't let you store data across those banks without crashing. It was a long battle with memory constraints to get that 15x speedup.
PunchyHamster•1d ago
we need a RISC-V equivalent but for NPUs, it's become a royal mess over the last few years
Neywiny•1d ago
It's starting. Some designs are moving towards very wide vector length (1k maybe even 2k?) RV-V cores. So less a giant matrix multiplication unit (I think TI has some parts with what they literally call MMUs, great work guys), more a bunch of DSP heavy CPUs. In the age of x86 splitting on AVX-512, it's interesting.
poad4242•7h ago
Hello! Author of the post here, happy to answer questions about the process. I have a draft white paper that details more of the process. Let me know if I should put it up on github or arxiv.
