Running a 270M LLM on Android (architecture and benchmarks)

1•ayushranjan99•18m ago
I’ve been experimenting with running small LLMs directly on mobile hardware (low-end Android devices), without relying on cloud inference. This is a summary of what worked, what didn’t, and why.

Cloud-based LLM APIs are convenient, but come with:

- latency from network round-trips
- unpredictable API costs
- privacy concerns (content leaving the device)
- the need for connectivity

For simple tasks like news summarization, small models seem “good enough,” so I tested whether a ~270M-parameter model (gemma3-270m) could run entirely on-device.

Model - Gemma3-270M (INT8 quantized)
Runtime - Cactus SDK (Android NPU/GPU acceleration)
App Framework - Flutter
Device - MediaTek 7300 with 8 GB RAM

Architecture
- User shares a URL to the app (Android share sheet).
- App fetches the article HTML and extracts readable text.
- Local model generates a summary.
- Device TTS reads the summary aloud.

Everything runs offline except the initial page fetch.
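The "extract readable text, then summarize" steps can be sketched roughly as below. This is a minimal illustration in Python using only the standard library, not the app's actual Flutter code. The `summarize` callable is a hypothetical stand-in for the on-device model call (it is not a Cactus SDK API), and the tag skip list and the 1,000-word cap are assumptions chosen for the example.

```python
from html.parser import HTMLParser


class ReadableTextExtractor(HTMLParser):
    """Collects visible text and skips tags that rarely hold article content."""

    # Assumed skip list for this sketch; a real extractor would be more careful.
    SKIP_TAGS = {"script", "style", "nav", "header", "footer", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside any skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            text = data.strip()
            if text:
                self._chunks.append(text)

    def text(self):
        return " ".join(self._chunks)


def extract_readable_text(html):
    """Return the visible text of an HTML page as a single string."""
    parser = ReadableTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def summarize_article(html, summarize, max_words=1000):
    """Extract text, truncate to max_words, and pass it to the model.

    `summarize` is a hypothetical callable standing in for local inference.
    """
    words = extract_readable_text(html).split()
    return summarize(" ".join(words[:max_words]))
```

Truncating the input is one simple way to keep the prompt short for a small model; it drops the tail of long articles, so it is a trade-off rather than a fix.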

Performance
- ~450–900 ms latency for a short summary (100–200 tokens).
- On devices without NPU acceleration, CPU-only inference takes 2–3× longer.
- Peak RAM: ~350–450 MB.
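As a rough sanity check on these figures, the arithmetic below estimates the weight footprint and the implied throughput. It assumes exactly 270M parameters stored at one byte each (plain INT8) and assumes the low and high ends of the latency and token ranges pair up; both are simplifications, and real quantized files can differ.

```python
def weight_footprint_mb(n_params, bytes_per_param):
    """Approximate size of the model weights in megabytes (10^6 bytes)."""
    return n_params * bytes_per_param / 1e6


def tokens_per_second(n_tokens, latency_ms):
    """Throughput implied by producing n_tokens in latency_ms milliseconds."""
    return n_tokens / (latency_ms / 1000)


weights_mb = weight_footprint_mb(270e6, 1)  # INT8: assumed 1 byte per parameter
print(f"weights: ~{weights_mb:.0f} MB")
print(f"throughput: ~{tokens_per_second(100, 450):.0f} tokens/s")
```

Under these assumptions the weights alone account for about 270 MB, which would leave roughly 80–180 MB of the reported peak for the runtime, activations, and KV cache.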

Limitations
- Quality is noticeably worse than GPT-5 for complex articles.
- Long-form summarization (>1k words) gets inconsistent.
- Web scraping is fragile for JS-heavy or paywalled sites.
- Some low-end phones throttle CPU/GPU aggressively.

| Metric  | Local (Gemma 270M)   | GPT-4o Cloud         |
| ------- | -------------------- | -------------------- |
| Latency | 0.5–1.5s             | 0.7–1.5s + network   |
| Cost    | 0                    | API cost per request |
| Privacy | Text stays on device | Sent over network    |
| Quality | Medium               | High                 |

GitHub - https://github.com/ayusrjn/briefly

Running small LLMs on-device is viable for narrow tasks like summarization. For more complex reasoning tasks, cloud models still outperform by a large margin, but the “local-first” approach seems promising for privacy-sensitive or offline-first applications. Cactus SDK does a pretty good job of handling the model and hardware acceleration.

Happy to answer questions :)
