Ask HN: Is it feasible to run a model on device for complete privacy?

3•mazinz•1h ago

Tried Gemma, Qwen and a few others. I need vision and larger context windows for an application I am working on. Results were quite poor. Gemma 4E2B was probably the best of the ones I tried, but it still fell apart and kept hallucinating at ~5000 tokens. Cloud-based models had no problems; even Gemini 3.1 Flash-Lite and GPT-5.4 mini do a lot better and are way faster.

Comments

benoau•1h ago

It's technically feasible; it's really just a question of whether this is worth $10,000 or more to you and whether you're willing to spend it.

mazinz•1h ago

Why financially crippling? It’s free to run on device. The native Apple Intelligence works well for smaller context windows and text only.

benoau•38m ago

You can get poor results "for free" from your laptop, but the devices you need for the large models are very expensive.

mc7alazoun•1h ago

Feasible but too expensive! I get that privacy is a priority for you, but unfortunately if you want quality models you'd probably still have to use frontier closed models.

mazinz•1h ago

No open source model that’s any good?

vitalyan1234•55m ago

The Gemma you tried is tiny; there are 31B and 26B (A4B) variants. There's also Qwen 3.6 with 27B and 35B (A3B) variants, reportedly pretty good. Try them on OpenRouter or something. These require 30-40 GB of memory to run, split between RAM and VRAM, less if quantized below near-lossless 8-bit.

There are near-SOTA open models, but they are 1T+ parameters, i.e. at 8 bits per weight they require over a terabyte of memory to run.
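
For a rough sense of where those memory figures come from, here is a back-of-the-envelope sketch. It assumes memory is dominated by the weights (parameters times bits per weight) and ignores KV cache, activations and runtime overhead, so real usage will be higher, especially with long contexts. The function name and the model table are made up for illustration; the sizes are just the ones mentioned above.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Assumption: every parameter is stored at the same bit width.
    KV cache, activations and runtime overhead are NOT included.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical table: labels and sizes taken from the comment above.
MODEL_SIZES_BILLIONS = {"27B": 27, "31B": 31, "35B": 35, "1T": 1000}

for name, size in MODEL_SIZES_BILLIONS.items():
    for bits in (16, 8, 4):
        gb = weight_memory_gb(size, bits)
        print(f"{name:>4} @ {bits:>2}-bit: ~{gb:.1f} GB")
```

By this estimate a 31B model at 8-bit needs about 31 GB for weights, which lines up with the 30-40 GB range, and a 1T model at 8-bit needs about 1000 GB. Dropping to 4-bit roughly halves those numbers, at some cost in quality.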
