Show HN: Plano – Edge and service proxy with orchestration for AI agents

https://github.com/katanemo/plano

8•adilhafeez•1mo ago

Hey HN, I’m Adil from Katanemo (with Salman, Shuguang, and Meiyu).

We previously shared an early version of this project as ArchGW. Based on customer feedback, the scope expanded from “LLM routing and model access” into something broader: delivery infrastructure for agentic applications. We renamed it to Plano and reworked the architecture accordingly.

The problem

AI practitioners on the ground will tell you that calling an LLM is not the hard part. The hard part is delivering agentic applications to production quickly and reliably, then iterating without rewriting system code every time. In practice, teams keep rebuilding the same concerns that sit outside any single agent’s core logic.

Teams need model agility: the ability to pull from a large set of LLMs and swap providers without refactoring prompts or streaming handlers. They need to learn from production by collecting signals and traces that tell them what to fix. They need consistent policy enforcement for moderation and jailbreak protection, rather than sprinkling hooks across codebases. And they need multi-agent patterns like handoff and specialization without turning their app into orchestration glue.

These concerns get rebuilt and maintained inside fast-changing frameworks and application code, coupling product logic to infrastructure decisions. The result is brittle and pulls teams away from core product work into plumbing they shouldn’t have to own.

What Plano does

Plano moves core delivery concerns out of process into a modular proxy and dataplane designed for agents. It supports inbound listeners (agent orchestration, safety and moderation hooks), outbound listeners (hosted or API-based LLM routing), or both together.

Plano provides the following capabilities via a unified, protocol-native, framework-friendly dataplane:

- Orchestration: Low-latency routing and handoff between agents. Add or change agents without modifying app code, and evolve strategies centrally instead of duplicating logic across services.

- Guardrails & Memory Hooks: Apply jailbreak protection, content policies, and context workflows (rewriting, retrieval, redaction) once via filter chains. This centralizes governance and ensures consistent behavior across your stack.

- Model Agility: Route by model name, semantic alias, or preference-based policies. Swap or add models without refactoring prompts, tool calls, or streaming handlers.

- Agentic Signals™: Zero-code capture of behavior signals, traces, and metrics across every agent, so token usage and learning signals surface in one place.
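
To give a rough picture of what "route by model name or semantic alias" means in the Model Agility bullet, here is a toy sketch in Python. This is not Plano's code or config schema; `ModelRouter`, the model names, the provider names, and the alias are all made up for illustration. The point is only that the app asks for a stable alias while the mapping to a concrete model lives in one place.

```python
from dataclasses import dataclass, field


@dataclass
class ModelRouter:
    """Toy resolver mapping a requested name to a (provider, model) pair.

    Hypothetical illustration only; not Plano's API or config schema.
    """
    models: dict[str, str] = field(default_factory=dict)   # model name -> provider
    aliases: dict[str, str] = field(default_factory=dict)  # semantic alias -> model name

    def resolve(self, requested: str) -> tuple[str, str]:
        # Follow an alias if one exists, otherwise treat the input as a model name.
        name = self.aliases.get(requested, requested)
        if name not in self.models:
            raise KeyError(f"unknown model or alias: {requested}")
        return self.models[name], name


# Placeholder names, not real providers or models.
router = ModelRouter(
    models={"model-a": "provider-x", "model-b": "provider-y"},
    aliases={"fast-summarizer": "model-a"},
)

# App code asks for the alias; it never hardcodes the backing model.
provider, model = router.resolve("fast-summarizer")
```

Repointing `fast-summarizer` at `model-b` changes what every caller gets without touching any caller, which is the property the bullet describes.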

The goal is to keep application code focused on product logic while Plano owns delivery mechanics.
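
For the filter-chain idea in the Guardrails bullet, a minimal sketch of the general pattern might look like the following. It assumes a filter is just a function that either rewrites a prompt or rejects it. The function names and the keyword check are hypothetical stand-ins, not how Plano implements its filters.

```python
import re
from typing import Callable, Optional

# A filter takes the prompt and returns a (possibly rewritten) prompt,
# or None to reject the request outright.
Filter = Callable[[str], Optional[str]]


def block_jailbreak(prompt: str) -> Optional[str]:
    # Naive keyword check standing in for a real jailbreak detector.
    if "ignore previous instructions" in prompt.lower():
        return None
    return prompt


def redact_emails(prompt: str) -> Optional[str]:
    # Replace anything that looks like an email address.
    return re.sub(r"[\w.+-]+@[\w-]+\.[\w.-]+", "[redacted]", prompt)


def run_chain(prompt: str, filters: list[Filter]) -> Optional[str]:
    # Apply filters in order; stop at the first rejection.
    for f in filters:
        result = f(prompt)
        if result is None:
            return None
        prompt = result
    return prompt


# Hypothetical chain: order matters, rejection happens before rewriting.
chain = [block_jailbreak, redact_emails]
```

Because the chain is defined once and runs in front of every agent, a policy change is a change to `chain`, not to each service.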

Comments

dang•1mo ago

> We previously shared an early version of this project as ArchGW

Looks like these are the previous threads, if anyone's curious:

Show HN: ArchGW – An intelligent edge and service proxy for agents - https://news.ycombinator.com/item?id=44546265 - July 2025 (15 comments)

Show HN: ArchGW – An open-source intelligent proxy server for prompts - https://news.ycombinator.com/item?id=43259862 - March 2025 (7 comments)

Show HN: archgw: open-source, intelligent proxy for AI agents, built on Envoy - https://news.ycombinator.com/item?id=42187132 - Nov 2024 (14 comments)

paidev•4w ago

Cool project—Envoy dataplane + tiny routing LLMs is a solid combo for agent handoffs without the usual orchestration bloat.

We've wrestled with similar delivery pains building MCP tools for agents (Claude/ChatGPT/Cursor all love 'em now). Proxies like yours shine for LLM routing, but tool backends often drag with auth/setup. Your MCP integration in filters caught my eye—@leanmcp/auth decorator drops proper JWT validation (Cognito/Auth0/etc.) to 20 lines vs 600+ raw, auto-injects user context everywhere. (Disclosure: co-founder on LeanMCP.)

How's the Brightstaff fallback to static policies holding up in prod? Happy to chat agent war stories.
