I built this because I wanted to add local LLM inference to a Go project without shelling out to Python or linking against llama.cpp. The whole thing is go get github.com/computerex/dlgo and you're running models.
It supports LLaMA, Qwen 2/3/3.5, Gemma 2/3, Phi-2/4, SmolLM2, Mistral, and Whisper speech-to-text. Architectures are expressed as a declarative per-layer spec resolved at load time, so adding a new model family is mostly just describing its layer structure rather than writing a new forward pass.
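The post doesn't show what the spec actually looks like, but the idea can be sketched roughly like this. Every type, field, and tensor-name template below is hypothetical (loosely following common GGUF naming), not dlgo's real API:

```go
package main

import "fmt"

// Hypothetical sketch of a declarative per-layer spec; dlgo's actual
// types aren't shown in the post, so every name here is invented.
// The tensor name templates loosely follow common GGUF conventions.
type LayerSpec struct {
	AttnNorm string // e.g. "blk.%d.attn_norm.weight"
	QProj    string
	KProj    string
	VProj    string
	OProj    string
}

type ModelSpec struct {
	Name   string
	Layers int
	Layer  LayerSpec // template stamped out once per layer at load time
}

// resolve expands a per-layer tensor name template for layer i, giving
// the concrete name to look up in the GGUF tensor table.
func resolve(tmpl string, i int) string {
	return fmt.Sprintf(tmpl, i)
}

func main() {
	spec := ModelSpec{
		Name:   "llama",
		Layers: 16,
		Layer: LayerSpec{
			AttnNorm: "blk.%d.attn_norm.weight",
			QProj:    "blk.%d.attn_q.weight",
			KProj:    "blk.%d.attn_k.weight",
			VProj:    "blk.%d.attn_v.weight",
			OProj:    "blk.%d.attn_output.weight",
		},
	}
	fmt.Println(resolve(spec.Layer.QProj, 0)) // blk.0.attn_q.weight
}
```

With a structure like this, supporting a new model family means filling in a new ModelSpec rather than writing another forward pass.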
Performance on a single CPU thread with Q4_K_M quantization: ~31 tok/s for LLaMA 3.2 1B, ~48 tok/s for Qwen3 0.6B, ~16 tok/s for Qwen3.5 2B (which has a hybrid attention + Gated Delta Network architecture). Not going to beat llama.cpp on raw speed, but it's fast enough to be useful and the ergonomics of a native Go library are hard to beat.
Supports 25+ GGML quantization formats (Q4_0 through Q8_0, all K-quants, I-quants, F16, BF16, F32). The GGUF parser, dequantization, tokenizer, forward pass, and sampling are all implemented from scratch.
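Q4_0 is the simplest of these formats, and its block layout is documented in ggml: 32 weights share one scale, and each weight is an unsigned 4-bit quant offset by 8, so a quant q decodes to (q - 8) * scale. Here's a self-contained sketch of that dequantization step (this is not dlgo's code; it also takes the scale as a float32 for brevity, whereas real GGUF blocks store it as fp16):

```go
package main

import "fmt"

// dequantQ4_0 decodes one GGML Q4_0 block: 32 weights packed two per
// byte, with byte j holding element j in its low nibble and element
// j+16 in its high nibble. Each 4-bit quant q decodes to (q-8)*scale.
// Real GGUF stores the scale as fp16; float32 is used here for brevity.
func dequantQ4_0(scale float32, qs [16]byte) [32]float32 {
	var out [32]float32
	for j := 0; j < 16; j++ {
		lo := int(qs[j]&0x0F) - 8 // element j
		hi := int(qs[j]>>4) - 8   // element j+16
		out[j] = float32(lo) * scale
		out[j+16] = float32(hi) * scale
	}
	return out
}

func main() {
	var qs [16]byte
	qs[0] = 0x1F // low nibble 15 -> +7, high nibble 1 -> -7
	w := dequantQ4_0(0.5, qs)
	fmt.Println(w[0], w[16], w[1]) // 3.5 -3.5 -4
}
```

The K-quants and I-quants use more elaborate block layouts (sub-block scales, non-uniform grids), but the basic shape of the work is the same: parse the packed block, then expand it to floats.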