Over the last few months, I noticed a massive problem: developers (including me) are lazy. We were sending every single prompt, even basic JSON extractions, to GPT-4o or Claude 3.5 Sonnet, and my API bills were skyrocketing.
So I built an AI gateway to fix this. It acts as a drop-in replacement for your OpenAI endpoint: when a request comes in, a tiny, fast classifier scores the prompt's complexity in a few milliseconds and picks an LLM based on that score, cutting costs by around 30%.
- Simple extraction and formatting: routes to Llama 3 8B or Gemini Flash (costs almost nothing).
- Complex reasoning: routes to GPT-4o/Claude (costs dollars).
- Semantic cache: if the exact same question was asked in the last 5 minutes, it serves the cached response instantly (costs zero).
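To make the flow concrete, here's a minimal sketch of the routing and caching logic. Everything here is an assumption for illustration: the real gateway presumably uses a trained classifier rather than this crude keyword/length heuristic, and `call_llm`, the model names, and the thresholds are all hypothetical.

```python
import hashlib
import time

# Hypothetical stand-in for the gateway's fast complexity classifier.
# Scores a prompt from 0.0 (trivial) to 1.0 (complex) using a crude
# heuristic; the real service would use a small trained model instead.
def complexity_score(prompt: str) -> float:
    reasoning_markers = ("why", "explain", "prove", "compare", "step by step")
    score = min(len(prompt) / 2000, 0.5)  # longer prompts trend harder
    if any(m in prompt.lower() for m in reasoning_markers):
        score += 0.5
    return min(score, 1.0)

# Illustrative routing table; model names and the 0.5 cutoff are assumptions.
def pick_model(score: float) -> str:
    return "gpt-4o" if score >= 0.5 else "llama-3-8b"

# Minimal exact-match cache with a 5-minute TTL, keyed on a prompt hash.
_CACHE: dict[str, tuple[float, str]] = {}
TTL_SECONDS = 300

def handle(prompt: str, call_llm) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    hit = _CACHE.get(key)
    if hit and time.time() - hit[0] < TTL_SECONDS:
        return hit[1]  # cache hit: no model call, costs zero
    answer = call_llm(pick_model(complexity_score(prompt)), prompt)
    _CACHE[key] = (time.time(), answer)
    return answer
```

With this shape, a plain extraction prompt falls below the threshold and goes to the cheap model, a "explain why, step by step" prompt crosses it and goes to the expensive one, and a repeated prompt within the TTL never leaves the cache.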
I'd love feedback on this. Happy to answer any questions!