Show HN: Go LLM inference with a Vulkan GPU back end that beats Ollama's CUDA

https://github.com/computerex/dlgo
1•computerex•2h ago
dlgo is an LLM inference engine written in Go. The CPU path has zero dependencies beyond the standard library. The GPU path uses Vulkan compute, so no CUDA is required.

I benchmarked it against Ollama using the exact same GGUF files on an RTX 4070 Ti SUPER:

GPU (dlgo Vulkan vs Ollama CUDA):

- Qwen3.5 0.8B: 239 tok/s vs 187 tok/s (28% faster)
- Gemma 3 270M: 456 tok/s vs 503 tok/s (−9%)
- SmolLM2 360M: 420 tok/s vs 451 tok/s (−7%)
- 10 models tested, within 7–25% of CUDA on standard architectures

CPU (dlgo vs Ollama, same GGUF):

- 6 of 10 models within 9% of Ollama
- 2 models faster (Gemma 270M +3%, SmolLM2 360M +7%)

The Qwen3.5 result surprised me. Qwen3.5 uses a hybrid Gated Delta Net + attention architecture (SSM layers with a recurrent delta rule). I wrote 6 custom Vulkan compute shaders for it, including conv1d, delta rule recurrence, L2 normalization, and sigmoid gating, and the fused Vulkan pipeline ended up outperforming llama.cpp's CUDA kernels.
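For readers unfamiliar with the delta rule recurrence mentioned above, here is a minimal CPU sketch of one gated delta rule step for a single head. It is a generic textbook-style formulation, S <- alpha*S + beta*(v - alpha*S*k)*k^T followed by o = S*q, and is not taken from dlgo's shaders; the function names, the state layout, and the way alpha and beta are chosen are all assumptions made for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// sigmoid maps a raw gate logit into (0, 1).
func sigmoid(x float64) float64 { return 1 / (1 + math.Exp(-x)) }

// l2Normalize returns x scaled to (approximately) unit Euclidean length.
// The small epsilon avoids division by zero for an all-zero vector.
func l2Normalize(x []float64) []float64 {
	var sum float64
	for _, v := range x {
		sum += v * v
	}
	inv := 1 / math.Sqrt(sum+1e-12)
	out := make([]float64, len(x))
	for i, v := range x {
		out[i] = v * inv
	}
	return out
}

// deltaStep applies one gated delta rule update in place.
// state has shape [len(v)][len(k)]; alpha is a decay factor and beta a
// write strength (both hypothetical inputs here). It returns state*q.
func deltaStep(state [][]float64, q, k, v []float64, alpha, beta float64) []float64 {
	out := make([]float64, len(v))
	for i := range state {
		row := state[i]
		var pred float64
		for j := range row {
			row[j] *= alpha        // decay the old memory
			pred += row[j] * k[j]  // what the memory currently returns for k
		}
		corr := beta * (v[i] - pred) // error between target value and prediction
		for j := range row {
			row[j] += corr * k[j]   // rank-1 correction along k
			out[i] += row[j] * q[j] // read out with the query
		}
	}
	return out
}

func main() {
	state := [][]float64{{0, 0}, {0, 0}}
	k := l2Normalize([]float64{3, 4})
	beta := sigmoid(2.0) // assumed: write strength comes from a sigmoid gate
	o := deltaStep(state, k, k, []float64{1, -1}, 0.9, beta)
	fmt.Println(o)
}
```

With beta = 1 and alpha = 1, writing a new value under an already-used unit key replaces the old value instead of adding to it, which is the property that distinguishes the delta rule from plain linear attention.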

Vulkan means this runs on AMD, Intel, and mobile GPUs too — not just NVIDIA. Ollama's own Vulkan backend is 66–126% slower than dlgo on the models I tested.
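The relative figures in the GPU benchmark list can be reproduced from the raw tok/s numbers. This is a small sketch, assuming "X% faster" means (dlgo / Ollama - 1) * 100; the helper names are illustrative and not part of dlgo.

```go
package main

import (
	"fmt"
	"math"
)

// relativePercent reports how much faster (positive) or slower (negative)
// rate a is compared with baseline rate b, in percent.
func relativePercent(a, b float64) float64 {
	return (a/b - 1) * 100
}

// roundedPercent rounds relativePercent to the nearest whole percent.
func roundedPercent(a, b float64) int {
	return int(math.Round(relativePercent(a, b)))
}

func main() {
	// tok/s pairs (dlgo Vulkan, Ollama CUDA) copied from the post.
	rows := []struct {
		name   string
		dlgo   float64
		ollama float64
	}{
		{"Qwen3.5 0.8B", 239, 187},
		{"Gemma 3 270M", 456, 503},
		{"SmolLM2 360M", 420, 451},
	}
	for _, r := range rows {
		fmt.Printf("%-14s %+d%%\n", r.name, roundedPercent(r.dlgo, r.ollama))
	}
}
```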

Supports LLaMA, Qwen2/3/3.5, Gemma, Phi, SmolLM2, Mistral, plus Whisper speech-to-text. 25+ quantization formats (Q4_0 through Q8_0, all K-quants).
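To give a sense of what a block quantization format involves, here is a simplified sketch in the style of Q8_0: 32 values share one scale, and each value is stored as a signed 8-bit integer. GGUF files store the scale as a 16-bit float; this sketch keeps it as float32 to stay within the standard library, and the type and function names are hypothetical rather than dlgo's API.

```go
package main

import (
	"fmt"
	"math"
)

const blockSize = 32 // values per block, as in Q8_0

// blockQ8 is a simplified Q8_0-style block: one shared scale plus 32 int8 values.
// Assumption: float32 scale instead of the fp16 used on disk.
type blockQ8 struct {
	scale float32
	q     [blockSize]int8
}

// quantizeQ8 picks the scale so the largest magnitude maps to 127.
func quantizeQ8(x [blockSize]float32) blockQ8 {
	var amax float32
	for _, v := range x {
		if a := float32(math.Abs(float64(v))); a > amax {
			amax = a
		}
	}
	var b blockQ8
	if amax == 0 {
		return b // all zeros: scale 0, quants 0
	}
	b.scale = amax / 127
	for i, v := range x {
		r := math.Round(float64(v / b.scale))
		// Clamp defensively so the int8 conversion is always in range.
		r = math.Max(-127, math.Min(127, r))
		b.q[i] = int8(r)
	}
	return b
}

// dequantizeQ8 reconstructs approximate float values: x[i] = scale * q[i].
func dequantizeQ8(b blockQ8) [blockSize]float32 {
	var x [blockSize]float32
	for i, q := range b.q {
		x[i] = b.scale * float32(q)
	}
	return x
}

func main() {
	var x [blockSize]float32
	for i := range x {
		x[i] = float32(i) * 0.1
	}
	b := quantizeQ8(x)
	y := dequantizeQ8(b)
	fmt.Println(b.scale, x[5], y[5])
}
```

The round-trip error per value is at most half the block scale, which is why formats with fewer bits per value (such as Q4_0) trade more accuracy for smaller files.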

Three lines to run:

model, _ := dlgo.LoadLLM("model.gguf")
response, _ := model.Chat("", "What is the capital of France?")
fmt.Println(response)

Show HN: Proxly – Self-hosted tunneling on your own domain in 60 seconds

1•a1tem•2m ago•0 comments

Show HN: Conflicts.app, Iran conflict dashboard better than alternatives

https://www.conflicts.app/dashboard
2•juliusolsson•4m ago•0 comments

Show HN: J2Download – A simple online downloader supporting 40 platforms

https://j2download.com/
1•manhg•5m ago•0 comments

Bippy: React Internals Toolkit

https://www.bippy.dev/
1•handfuloflight•5m ago•0 comments

The Window Chrome of Our Discontent

https://pxlnv.com/blog/window-chrome-of-our-discontent/
1•SoKamil•9m ago•0 comments

How I've learned that certainty is the thing to fear

https://www.bbc.com/news/articles/c1w5z1d447lo
1•cmsefton•9m ago•0 comments

Show HN: Muffle – Blur everything except the active window in macOS

https://www.getmuffle.com/
1•AbjMV•11m ago•1 comments

I was "early" in agentic coding. Here's my story

2•noemit•17m ago•0 comments

Show HN: Drizby – WIP Metabase Alternative

https://www.drizby.com
1•cliftonc•19m ago•0 comments

The First Multi-Behavior Brain Upload

https://twitter.com/alexwg/status/2030217301929132323
1•DarkCow•19m ago•0 comments

Anthropic CEO reveals the reasons he rejected The Pentagon

https://xcancel.com/0xmitsurii/status/2030451168678457766
4•doener•19m ago•0 comments

Show HN: Stardial – a highly customizable terminal clock (Rust)

https://github.com/hisuic/stardial
2•firesushi•21m ago•0 comments

Emporion: A P2P Economy for Agents

https://github.com/garydevenay/emporion
1•garydevenay•21m ago•0 comments

Microsoft/Hve-Core

https://github.com/microsoft/hve-core
2•coderlens•21m ago•0 comments

Solving Compaction with Lobotomy

https://grimridge.net/blog/solving-compaction-with-lobotomy/
2•WadeGrimridge•23m ago•0 comments

Pushing and pulling: three reactivity algorithms

https://jonathan-frere.com/posts/reactivity-algorithms/
1•fanf2•24m ago•0 comments

Reverse engineering a DOS game with no source code using Codex 5.4

https://github.com/ammaarreshi/SkyRoads-Codex
1•smusamashah•25m ago•1 comments

Show HN: OpenClaw – Self-host OpenClaw in one command

1•congzhangzh•31m ago•0 comments

Money and collateral in an AI-first society

https://adlrocha.substack.com/p/adlrocha-money-and-collateral-in
1•adlrocha•34m ago•0 comments

Ask HN: Can I repurpose a Bluetooth voice remote as input device for a PC?

1•albert_e•36m ago•1 comments

Ask HN: How are you handling persistent memory across local Ollama sessions

1•null-phnix•37m ago•0 comments

Show HN: Spadyum – An Open-Source Civilization Backup Protocol

https://github.com/kivancadiguzel-design/Spadyum-Genesis/blob/main/README.md
1•Spadyum_Genesis•38m ago•0 comments

Julia Snail – An Emacs Development Environment for Julia Like Clojure's Cider

https://github.com/gcv/julia-snail
1•TheWiggles•39m ago•0 comments

Notes on Writing WASM

https://notes.brooklynzelenka.com/Blog/Notes-on-Writing-Wasm
4•vinhnx•42m ago•0 comments

Making Firefox's right-click not suck, more, with userChrome.css

https://joshua.hu/firefox-making-right-click-not-suck-even-more-with-userchrome
3•mmsc•43m ago•1 comments

Run prompts on a schedule with Claude Code

https://code.claude.com/docs/en/scheduled-tasks
1•blacktulip•43m ago•0 comments

Show HN: Open-source self-hosted Intercom and CCTV platform

https://github.com/rosteleset/SmartYard-Server
2•sbca68•46m ago•0 comments

Show HN: Self-Evolving Skill – empirical results from a 5-round experiment

https://github.com/191341025/Self-Evolving-Skill
1•tiansenxu•51m ago•0 comments

What Is AI Reading?

https://generativepulse.ai/report/
1•doener•51m ago•0 comments

Rcarmo/piclaw: An all-in one agent environment with a mobile-first web UI

https://github.com/rcarmo/piclaw
1•rcarmo•55m ago•0 comments