I built a pure WGSL LLM engine to run Llama on my Snapdragon laptop GPU

https://github.com/Beledarian/wgpu-llm
1•Beledarian•1h ago

Comments

Beledarian•1h ago
Hi HN,

I recently bought a Snapdragon X Elite Copilot+ laptop and realized my integrated Adreno GPU was basically a paperweight for local AI. Standard tools like LM Studio and the massive PyTorch ecosystem didn't support it, forcing everything onto the CPU. I didn't want to wait for the ecosystem to catch up, so I built a from-scratch inference engine to bypass it entirely.

It’s written purely in Rust and WGSL. No CUDA, no Python, no heavy frameworks. Just raw compute shaders dispatching the Transformer forward pass, which makes it portable (it runs on Windows, macOS, and Linux via Vulkan/Metal/DX12). Currently, I'm getting ~33 tok/s on the Snapdragon Adreno (~25 with fp16) and 66+ tok/s (fp16/fp32) on an RTX 3090 with TinyLlama.

The build process: I actually had a dual motivation here. Beyond solving my hardware gap, I wanted a stress test for my own LLM orchestration tools. A Transformer engine requires exact math, strict buffer layouts (those WebGPU vec3 alignment traps are real), and standalone compute shaders, so there is zero room for AI hallucination. I spent the time developing and validating a strict architectural blueprint up front. Then, using highly specific prompts, strict behavior guidance, and my custom MCP tools to feed the AI the exact WGSL specs, I successfully scaffolded that predefined human architecture into working code in under 16 hours.
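For readers who have not hit the vec3 trap mentioned above, here is a minimal sketch of it, not code from the repo. In WGSL, `vec3<f32>` occupies 12 bytes but is aligned to 16, so a host-side `#[repr(C)]` struct with two back-to-back `[f32; 3]` fields does not match a WGSL struct with two `vec3<f32>` members. The struct and function names (`NaiveParams`, `PaddedParams`, `naive_b_offset`, `padded_b_offset`) are hypothetical and exist only for illustration.

```rust
use std::mem::{offset_of, size_of}; // offset_of! needs Rust 1.77 or newer

// Hypothetical host-side mirror of WGSL `struct Params { a: vec3<f32>, b: vec3<f32> }`
// written the naive way: Rust packs `b` right after `a`, at byte offset 12.
#[allow(dead_code)]
#[repr(C)]
struct NaiveParams {
    a: [f32; 3],
    b: [f32; 3],
}

// Same data with explicit padding so each 3-component vector starts on a
// 16-byte boundary, which is where WGSL places a `vec3<f32>` member.
#[allow(dead_code)]
#[repr(C)]
struct PaddedParams {
    a: [f32; 3],
    _pad0: f32,
    b: [f32; 3],
    _pad1: f32,
}

/// Byte offset of `b` in the naive layout.
fn naive_b_offset() -> usize {
    offset_of!(NaiveParams, b)
}

/// Byte offset of `b` in the padded layout.
fn padded_b_offset() -> usize {
    offset_of!(PaddedParams, b)
}

fn main() {
    // The shader reads `b` at offset 16, so the naive layout would hand it
    // bytes that straddle `a` and `b`.
    println!("naive:  b at {}, size {}", naive_b_offset(), size_of::<NaiveParams>());
    println!("padded: b at {}, size {}", padded_b_offset(), size_of::<PaddedParams>());
}
```

Checking offsets like this on the host side is one way to catch a mismatch before it shows up as silently wrong math on the GPU.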

It is very much alpha software. It's decode-only, single-sequence, and currently uses CPU-side sampling. I’d love to hear your thoughts, especially from anyone with deep WGSL/WebGPU experience regarding buffer layouts or optimizing the INT8 GEMM paths (I know I need to move to a tiled implementation to get around the VRAM bandwidth bottleneck).
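As a rough illustration of the tiling idea mentioned above, here is a CPU-side reference sketch in Rust, not the repo's WGSL kernel. It multiplies INT8 matrices with 32-bit accumulators and walks the output in square blocks, so each block of inputs is reused several times before moving on. On a GPU the same blocking would typically stage tiles in workgroup memory. The function names, the row-major layout, and the `TILE` size are assumptions made for the example.

```rust
/// Naive reference: C = A * B with i8 inputs and i32 accumulators.
/// A is m x k, B is k x n, both row-major (an assumption for this sketch).
fn gemm_naive(a: &[i8], b: &[i8], m: usize, k: usize, n: usize) -> Vec<i32> {
    let mut c = vec![0i32; m * n];
    for i in 0..m {
        for j in 0..n {
            let mut acc = 0i32;
            for p in 0..k {
                acc += a[i * k + p] as i32 * b[p * n + j] as i32;
            }
            c[i * n + j] = acc;
        }
    }
    c
}

/// Hypothetical tile edge; a real kernel would tune this per device.
const TILE: usize = 4;

/// Same product, computed block by block. Each (i0, j0) output tile
/// accumulates partial sums over successive k-tiles.
fn gemm_tiled(a: &[i8], b: &[i8], m: usize, k: usize, n: usize) -> Vec<i32> {
    let mut c = vec![0i32; m * n];
    for i0 in (0..m).step_by(TILE) {
        for j0 in (0..n).step_by(TILE) {
            for p0 in (0..k).step_by(TILE) {
                // Clamp tile bounds so ragged edges are handled.
                let (i1, j1, p1) = ((i0 + TILE).min(m), (j0 + TILE).min(n), (p0 + TILE).min(k));
                for i in i0..i1 {
                    for j in j0..j1 {
                        let mut acc = c[i * n + j];
                        for p in p0..p1 {
                            acc += a[i * k + p] as i32 * b[p * n + j] as i32;
                        }
                        c[i * n + j] = acc;
                    }
                }
            }
        }
    }
    c
}

fn main() {
    // Dimensions deliberately not multiples of TILE to exercise the edges.
    let (m, k, n) = (5, 7, 6);
    let a: Vec<i8> = (0..m * k).map(|x| (x as i32 % 17 - 8) as i8).collect();
    let b: Vec<i8> = (0..k * n).map(|x| (x as i32 % 13 - 6) as i8).collect();
    assert_eq!(gemm_naive(&a, &b, m, k, n), gemm_tiled(&a, &b, m, k, n));
    println!("tiled and naive results match");
}
```

Keeping a naive version next to the tiled one gives a cheap correctness oracle when porting the blocking scheme to a compute shader.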

Happy to answer any questions about the architecture or the build process!

Repo: https://github.com/Beledarian/wgpu-llm
