The goal is to explore what a Rust-native control plane for modern multimodal inference looks like: continuous batching, KV-aware admission control, predictable behavior under load, and proper streaming semantics.
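To make "KV-aware admission control" concrete, here's a minimal sketch of the idea (the type and field names below are shorthand for this post, not the actual API): a request is admitted only if its worst-case KV-cache footprint fits in free blocks, queued if it could fit once running requests drain, and rejected immediately if it can never fit.

```rust
// Illustrative only: `KvPool` and its fields are placeholders,
// not the real types in the repo.
struct KvPool {
    total_blocks: usize,
    used_blocks: usize,
    tokens_per_block: usize,
}

enum Admission {
    Admit,
    Queue, // could fit once running requests release blocks
    Shed,  // can never fit; fail fast (e.g. 429) instead of stalling
}

impl KvPool {
    fn admit(&self, prompt_tokens: usize, max_new_tokens: usize) -> Admission {
        // Worst-case footprint: the full prompt plus the entire decode budget.
        let needed = (prompt_tokens + max_new_tokens).div_ceil(self.tokens_per_block);
        let free = self.total_blocks - self.used_blocks;
        if needed <= free {
            Admission::Admit
        } else if needed <= self.total_blocks {
            Admission::Queue
        } else {
            Admission::Shed
        }
    }
}
```

Shedding impossible requests up front, rather than letting them camp in the queue, is a big part of staying predictable under load.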
The system exposes an OpenAI-compatible API, supports multi-image inputs, and is designed to degrade gracefully under overload rather than OOM or stall. It’s organized as a monorepo with a gateway, GPU workers, a scheduler, and pluggable engine adapters.
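To give a rough sense of the adapter boundary, it's conceptually a trait along these lines (simplified for the post; the actual trait and types in the repo differ):

```rust
// Simplified sketch of an engine adapter; real request/response types
// would carry sampling params, usage stats, etc.
use futures::stream::BoxStream;

pub struct GenerateRequest {
    pub prompt: String,
    pub images: Vec<Vec<u8>>, // encoded image bytes for multi-image inputs
    pub max_new_tokens: usize,
}

pub struct TokenChunk {
    pub text: String,
    pub finished: bool,
}

#[async_trait::async_trait]
pub trait EngineAdapter: Send + Sync {
    /// Start generation and stream chunks back. Dropping the returned
    /// stream should propagate cancellation to the worker.
    async fn generate(
        &self,
        req: GenerateRequest,
    ) -> anyhow::Result<BoxStream<'static, anyhow::Result<TokenChunk>>>;
}
```

Tying cancellation to stream drop means a client disconnect at the gateway can unwind all the way down to the GPU worker.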
We’ve also included a benchmark suite focused on real-world scenarios rather than synthetic tokens/sec numbers: time to first token (TTFT), cancellation, overload, and fairness.
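For example, TTFT here means wall-clock time from sending the request to receiving the first streamed event. A minimal probe against the OpenAI-compatible endpoint looks something like this (illustrative; a real harness would parse SSE frames properly):

```rust
use std::time::{Duration, Instant};

// Illustrative TTFT probe; the model name and prompt are placeholders.
async fn measure_ttft(client: &reqwest::Client, base_url: &str) -> anyhow::Result<Duration> {
    let start = Instant::now();
    let mut resp = client
        .post(format!("{base_url}/v1/chat/completions"))
        .json(&serde_json::json!({
            "model": "default",
            "stream": true,
            "messages": [{ "role": "user", "content": "hello" }],
        }))
        .send()
        .await?
        .error_for_status()?;
    // Treat the first body chunk as the first token event. Good enough
    // for a sketch; in practice the first SSE frame may be a role delta.
    let _first_chunk = resp.chunk().await?;
    Ok(start.elapsed())
}
```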
Would love feedback from folks building or operating inference infrastructure.