Qwen3 30B A3B Hits 13 token/s on 4xRaspberry Pi 5

https://github.com/b4rtaz/distributed-llama/discussions/255
58•b4rtazz•3h ago

Comments

geerlingguy•59m ago
distributed-llama is great, I just wish it would work with more models. I've been happy with ease of setup and its ongoing maintenance compared to Exo, and performance vs llama.cpp RPC mode.
alchemist1e9•26m ago
Any pointers to what is SOTA for a cluster of hosts with CUDA GPUs that lack enough VRAM for the full weights but have 10Gbit low-latency interconnects?

If that problem gets solved, even if only with a batch approach that enables parallel batch inference (high total token/s but low per session), and for bigger models, then it would be a serious game changer for large-scale, low-cost AI automation without billions in capex. My intuition says it should be possible, so perhaps someone has done it or started on it already.
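
To make the "high total token/s but low per session" trade-off concrete, here is a toy back-of-the-envelope model of pipeline-parallel decoding, where each host holds a slice of the layers. It is only a sketch: the function name, the stage and hop timings, and the assumption that each stage handles one token at a time are all hypothetical, and it does not describe how distributed-llama or any other project actually schedules work.

```python
def pipeline_throughput(stage_ms, hop_ms, sessions):
    """Toy model of pipeline-parallel decoding across hosts.

    stage_ms: per-token compute time on each host, one entry per stage (assumed)
    hop_ms:   network latency to hand activations to the next host (assumed)
    sessions: number of independent sequences kept in flight
    Returns (per_session_tokens_per_s, total_tokens_per_s).
    """
    n_stages = len(stage_ms)
    # A single token for one session must visit every stage in order,
    # paying one network hop per stage (including the hop back to stage 0).
    token_latency_ms = sum(stage_ms) + hop_ms * n_stages
    unloaded_tps = 1000.0 / token_latency_ms
    # With many sessions in flight, the slowest stage caps total throughput.
    ceiling_tps = 1000.0 / max(stage_ms)
    total_tps = min(sessions * unloaded_tps, ceiling_tps)
    return total_tps / sessions, total_tps


if __name__ == "__main__":
    # Made-up numbers: 4 hosts, 10 ms of compute each, 0.5 ms per hop.
    for s in (1, 4, 16):
        per_session, total = pipeline_throughput([10, 10, 10, 10], 0.5, s)
        print(f"sessions={s:2d} per-session={per_session:6.2f} tok/s total={total:6.2f} tok/s")
```

In this model, adding sessions raises total throughput until the slowest stage saturates, after which each extra session only lowers the per-session rate. Real GPU stages can also batch several sessions per step, which the sketch ignores.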

echelon•41m ago
This is really impressive.

If we can get this down to a single Raspberry Pi, then we have crazy embedded toys and tools. Locally, at the edge, with no internet connection.

Kids will be growing up with toys that talk to them and remember their stories.

We're living in the sci-fi future. This was unthinkable ten years ago.

taminka•23m ago
i feel sorry for your kids if you think this shit is inspiring lol

chatgpt is literally leading ppl with higher education to have full on psychosis by feeding into their insane delusions and confirmation bias, im sure a less smart version of this is a perfect toy for a kid w/o a fully developed brain yet

literally go touch grass bro...

dingdingdang•28m ago
Very impressive numbers.. wonder how this would scale on 4 relatively modern desktop PCs, like say something akin to an i5 8th Gen Lenovo ThinkCentre, these can be had for very cheap. But like @geerlingguy indicates - we need model compatibility to go up up up! As an example it would be amazing to see something like fastsdcpu run distributed to democratize accessibility-to/practicality-of image gen models for people with limited budgets but large PC fleets ;)
rthnbgrredf•15m ago
I think it is all well and good, but the most affordable option is probably still to buy a used MacBook with 16/32 or 64 GB (depending on the budget) unified memory and install Asahi Linux for tinkering.

Graphics cards with a decent amount of memory are still massively overpriced (even used), big, noisy and draw a lot of energy.

mehdibl•26m ago
1. This is Q4

2. This remains slow

3. The context window used here is likely 8k or similar which makes it unusable for bigger input/output.

Models already work fine on phones; just try https://github.com/google-ai-edge/gallery and you will see local AI running fine on phones.
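
For a rough sense of why the quantization level and context window matter on small boards, the arithmetic can be sketched as below. This is an estimate under stated assumptions, not a measurement from the linked discussion: the helper names are made up, the weight figure ignores the per-block scales that real 4-bit formats add, and the layer count, KV-head count and head dimension in the example call are illustrative placeholders rather than confirmed values for this model.

```python
def weight_bytes(n_params, bits_per_weight):
    """Approximate size of the quantized weights in bytes."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, context_len, bytes_per_value=2):
    """Approximate KV-cache size for one sequence.

    The leading 2 accounts for storing both a K and a V tensor per layer.
    bytes_per_value=2 assumes 16-bit cache entries.
    """
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_value


if __name__ == "__main__":
    gib = 1024 ** 3
    # 30B parameters at 4 bits per weight: about 15 GB before format overhead.
    print(f"weights:  {weight_bytes(30e9, 4) / gib:.1f} GiB")
    # Placeholder architecture values (assumptions), 8k-token context.
    print(f"kv cache: {kv_cache_bytes(48, 4, 128, 8192) / gib:.2f} GiB")
```

The KV cache grows linearly with context length, so a longer context window increases the memory needed on top of the fixed weight budget.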
