Hey everyone,
I built llm-schema-guard because LLMs are amazing at spitting out JSON... until they suddenly aren't. Even with JSON mode or function calling, you still get missing fields, wrong types, or just plain broken syntax that kills your agents, RAG flows, or any tool-calling setup.
This is a lightweight Rust HTTP proxy that sits in front of any OpenAI-compatible API (think Ollama, vLLM, LocalAI, OpenAI itself, Groq, you name it). It grabs the generated output, checks it against a JSON Schema you provide, and only lets it through if it's valid.
If it's invalid, strict mode kicks back a clean 400 with details about what failed. Permissive mode instead retries a few times, appending a fix instruction to the prompt and backing off exponentially between attempts.
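To give a feel for it, here's roughly what hitting the proxy in strict mode looks like from a client. This is just a sketch based on my assumptions: the port, path, and model name are placeholders, not anything the project mandates.

```rust
// Sketch of a client calling the proxy in strict mode.
// Assumed Cargo deps: reqwest = { version = "0.12", features = ["blocking", "json"] }, serde_json = "1"
use serde_json::json;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let client = reqwest::blocking::Client::new();

    // Same request you'd send to the upstream OpenAI-compatible API,
    // just pointed at the proxy instead (URL is a placeholder).
    let resp = client
        .post("http://localhost:8080/v1/chat/completions")
        .json(&json!({
            "model": "llama3",
            "messages": [{ "role": "user", "content": "Give me a user profile as JSON" }]
        }))
        .send()?;

    if resp.status() == reqwest::StatusCode::BAD_REQUEST {
        // Strict mode: the generated output failed schema validation; the body says why.
        eprintln!("schema validation failed: {}", resp.text()?);
    } else {
        // Output that already passed the schema check.
        let body: serde_json::Value = resp.json()?;
        println!("{body}");
    }
    Ok(())
}
```

The point is that your application code never sees malformed output: it either gets JSON that matches the schema or an explicit error it can handle.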
Everything else passes through unchanged. You get full streaming support (the response is buffered so it can be validated), plus Prometheus metrics for validation failures, retries, latency, and more. Config is a simple YAML file covering upstreams, per-model schemas, rate limiting, caching, etc. There's also an offline CLI if you just want to test schemas locally.
It's built with Axum and Tokio for low latency and high throughput, with jsonschema-rs doing the validation under the hood. A Docker Compose setup makes it dead simple to spin up alongside Ollama.
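If you're curious what the validation core boils down to, here's a rough sketch with the jsonschema crate. This isn't the proxy's actual code, and the crate's API has shifted a bit between versions; the schema and output below are made-up examples.

```rust
// Assumed Cargo deps: jsonschema = "0.17", serde_json = "1"
use jsonschema::JSONSchema;
use serde_json::json;

fn main() {
    // Example schema the model output must satisfy.
    let schema = json!({
        "type": "object",
        "required": ["name", "age"],
        "properties": {
            "name": { "type": "string" },
            "age":  { "type": "integer", "minimum": 0 }
        }
    });

    // Pretend this came back from the model.
    let output = json!({ "name": "Ada", "age": "thirty-six" });

    let compiled = JSONSchema::compile(&schema).expect("schema itself is valid");
    if let Err(errors) = compiled.validate(&output) {
        for err in errors {
            // e.g. '"thirty-six" is not of type "integer"' at /age
            eprintln!("{} (at {})", err, err.instance_path);
        }
    }
}
```

Strict mode surfaces errors like these to the caller; permissive mode feeds them back into the retry prompt.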
This grew out of my earlier schema-gateway project, and I'm happy to add stuff like Anthropic support, tool calling validation, or better streaming fixes if people find it useful.
Stars or contributions are very welcome!
Thanks for taking a look :)