We built Xybrid, a Rust library for running LLM + speech pipelines directly inside your app: no server, no daemon, just one binary.
We started building it while working on a privacy-focused LLM app with Tauri and realized there wasn’t a straightforward way to embed models directly into shipped applications without relying on a separate server process.
Xybrid links into your process like any other library. It supports GGUF, ONNX, and CoreML model formats and integrates with Flutter, Swift, Kotlin, Unity, and Tauri, letting you run pipelines like speech → LLM → speech in a single call.
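To give a feel for the "single call" idea, here's a pseudocode-style Rust sketch. The type and method names below are hypothetical illustrations, not Xybrid's actual API; the point is the shape, with every stage running in-process:

```rust
// Hypothetical sketch — these names are NOT Xybrid's real API.
// Illustrates an embedded speech → LLM → speech pipeline where
// every stage runs inside your own process, no server involved.
let pipeline = Pipeline::builder()
    .stt("whisper-small.gguf")   // speech-to-text stage
    .llm("llama-3b-q4.gguf")     // small quantized language model
    .tts("piper-voice.onnx")     // text-to-speech stage
    .build()?;

// One call: raw microphone audio in, synthesized reply audio out,
// entirely on-device.
let reply_audio = pipeline.run(mic_audio)?;
```

See the repo for the real interface.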
On recent phones, we’re seeing ~20 tok/s on Android and ~40 tok/s on iOS for small (~3B) quantized models (varies by device, backend, and thermals).
The demo that shows it best: a Unity tavern scene where 6 NPCs generate real-time dialogue fully on-device — no API key, no internet, no per-request cost.
Unity demo: https://youtu.be/vSPeTyeow6A

Desktop demo (Tauri): https://youtu.be/o83YShqV7O4
GitHub: https://github.com/xybrid-ai/xybrid
It’s still early — there are rough edges, especially around model support and performance tuning. Happy to answer questions about the architecture, backends, or integrations (Flutter, Swift, Kotlin, Unity, Tauri).