Show HN: ArchGW – an intelligent edge and service proxy for agents

35•honorable_coder•1d ago

Hey HN!

This is Adil, Salman and Jose and and we’re behind archgw [1]. An intelligent proxy server designed as an edge and AI gateway for agents - one that natively know how to handle prompts, not just network traffic. We’ve made several sweeping changes so sharing the project again.

A bit of background on why we’ve built this project. Building AI agent demos is easy, but to create something production-ready there is a lot of repeat low-level plumbing work that everyone is doing. You’re applying guardrails to make sure unsafe or off-topic requests don’t get through. You’re clarifying vague input so agents don’t make mistakes. You’re routing prompts to the right expert agent based on context or task type. You’re writing integration code to quickly and safely add support for new LLMs. And every time a new framework hits the market or is updated, you’re validating or re-implementing that same logic—again and again.

Putting all the low-level plumbing code in a framework gets messy to manage, harder to update and scale. Low-level work isn't business logic. That’s why we built archgw - an intelligent proxy server that handles prompts during ingress and egress and offers several related capabilities from a single software service. It lives outside your app runtime, so you can keep your business logic clean and focus on what matters. Think of it like a service mesh, but for AI agents.

Prior to building archgw, the team spent time building Envoy [2] at Lyft, API Gateway at AWS, specialized NLP models at Microsoft Research and worked on safety at Meta. archgw was born out of the belief that rule-based, single-purpose tools that handle the work around resiliency, processing and routing prompts should move into a dedicated infrastructure layer for agents, but built on the battle-tested foundational of Envoy Proxy.

The intelligence in archgw comes from our fast Task-specific LLMs [3] that can handle things like agent routing and hand off, guardrails and preference-based intelligent LLM calling. Here are some additional details about the open source project. archgw is written in rust, and the request path has three main parts:

* Listener subsystem which handles downstream (ingress) and upstream (egress) request processing. * Prompt handler subsystem. This is where archgw makes decisions on the safety of the incoming request via its prompt_guard hooks and identifies where to forward the conversation to via its prompt_target primitive. * Model serving subsystem is the interface that hosts all the lightweight LLMs engineered in archgw and offers a framework for things like hallucination detection of our these models

We loved building this open source project, and our belief is that this infra primitive would help developers build faster, safer and more personalized agents without all the manual prompt engineering and systems integration work needed to get there. We hope to invite other developers to use and improve Arch. Please give it a shot and leave feedback here, or at our discord channel [4] Also here is a quick demo of the project in action [5]. You can check out our public docs here at [6]. Our models are also available here [7].

[1] https://github.com/katanemo/archgw [2] https://www.envoyproxy.io/ [3] https://huggingface.co/collections/katanemo/arch-function-66... [4] https://discord.com/channels/1292630766827737088/12926307682... [5] https://www.youtube.com/watch?v=I4Lbhr-NNXk [6] https://docs.archgw.com/ [7] https://huggingface.co/katanemo

Comments

mutant•22h ago

Huh, this is pretty dope. I tried this example https://github.com/katanemo/archgw/blob/main/demos/samples_p...

And was pleased with what I was able to do. Thanks

sparacha•21h ago

That’s an example of what the edge component could do. Did you give the preference-based automatic routing a try?

mutant•20h ago

No, but I've already put this at the top of my tinker pile. I'm sure I will soon

isuckatcoding•8h ago

I’m still new to this ecosystem but is this something you’d use together with langchain or does it replace some use cases there?

honorable_coder•7h ago

What’s missing right now are our guides showing how well ArchGW integrates with existing frameworks and tools. But the core idea is simple: it offloads low-level responsibilities—like routing, safety, and observability—that frameworks like LangChain currently try to handle inside the app. That means less bloat and more clarity in your agent logic.

And importantly, some things just can’t be done well in a framework. For example, enforcing global rate limits across LLMs isn’t realistic when each agent instance holds its own local state. That kind of cross-cutting concern needs to live in infrastructure—not in application code.

jufter•7h ago

Was going to ask how this integrates into Envoy but dug into the code it looks like proxywasm which must mean `envoy.bootstrap.wasm` ?

honorable_coder•2h ago

We’re using proxy-wasm and compiling to wasm32-wasip1, then mounting the .wasm binaries into Envoy as HTTP filters via envoy.filters.http.wasm. The line you're referring to:

vm_config: runtime: "envoy.wasm.runtime.v8" code: local: filename: "/etc/envoy/proxy-wasm-plugins/prompt_gateway.wasm"

…is where the integration happens. There's no need to modify envoy.bootstrap.wasm; instead, Arch loads the WASM modules at runtime using standard Envoy config templating. The filters (prompt_gateway for ingress, and llm_gateway for egress sit in the request path and do things like prompt inspection, model routing, header rewrites, and telemetry collection.

Show HN: A Raycast-compatible launcher for Linux

Show HN: Learn LLMs LeetCode Style

Show HN: A Lisp for code generation and metaprogramming in non-Lisp languages

Show HN: A Browser-Only Dream Interpreter Using Symbol Logic and JavaScript

Show HN: I built this to talk Danish to my girlfriend – works with any language

Show HN: Type-safe PostgreSQL helpers for Kysely – arrays, JSONB, and vector ops

Show HN: ArchGW – an intelligent edge and service proxy for agents

Show HN: c0admin – A terminal-based AI assistant for Linux sysadmins

Show HN: I made a free tool to sync Strava activities with your calendar

Show HN: I made a JSFiddle-style playground to test and share prompts fast

Show HN: CMS-like editing for Markdown with contenteditable and 100 lines of JS

Show HN: The simplest way to use MCP. local-first. 100% open source

Show HN: An open-source, Android app for discovering privacy-respecting software

Show HN: Vibe Kanban – Kanban board to manage your AI coding agents

Show HN: Clu3 – Team up with GPTs in a 2v2 game of codenames

Show HN: DesignArena – crowdsourced benchmark for AI-generated UI/UX

Show HN: I built a toy music controller for my 5yo with a coding agent

Show HN: HNping 'remind me later' for HN via web push

Show HN: Pyhoff – Connect Python ML Models to Beckhoff/WAGO IO Hardware

Show HN: RULER – Easily apply RL to any agent

Show HN: Pangolin – Open source alternative to Cloudflare Tunnels

Show HN: I added Game of Life to my Portfolio Website and it's so cool

Show HN: Sohri – Turn short stories into binge-able audio episodes

Show HN: OffChess – Offline chess puzzles app

Show HN: We developed an AI tool to diagnose car problems

Show HN: An educational Local Qwen3 LLM Inference project written in Rust

Show HN: Cactus – Ollama for Smartphones

Show HN: FlopperZiro – A DIY open-source Flipper Zero clone

Show HN: CXXStateTree – A modern C++ library for hierarchical state machines

Show HN: Interactive pinout for the Raspberry Pi Pico 2

Show HN: ArchGW – an intelligent edge and service proxy for agents

Comments

Show HN: A Raycast-compatible launcher for Linux

Show HN: Learn LLMs LeetCode Style

Show HN: A Lisp for code generation and metaprogramming in non-Lisp languages

Show HN: A Browser-Only Dream Interpreter Using Symbol Logic and JavaScript

Show HN: I built this to talk Danish to my girlfriend – works with any language

Show HN: Type-safe PostgreSQL helpers for Kysely – arrays, JSONB, and vector ops

Show HN: ArchGW – an intelligent edge and service proxy for agents

Show HN: c0admin – A terminal-based AI assistant for Linux sysadmins

Show HN: I made a free tool to sync Strava activities with your calendar

Show HN: I made a JSFiddle-style playground to test and share prompts fast

Show HN: CMS-like editing for Markdown with contenteditable and 100 lines of JS

Show HN: The simplest way to use MCP. local-first. 100% open source

Show HN: An open-source, Android app for discovering privacy-respecting software

Show HN: Vibe Kanban – Kanban board to manage your AI coding agents

Show HN: Clu3 – Team up with GPTs in a 2v2 game of codenames

Show HN: DesignArena – crowdsourced benchmark for AI-generated UI/UX

Show HN: I built a toy music controller for my 5yo with a coding agent

Show HN: HNping 'remind me later' for HN via web push

Show HN: Pyhoff – Connect Python ML Models to Beckhoff/WAGO IO Hardware

Show HN: RULER – Easily apply RL to any agent

Show HN: Pangolin – Open source alternative to Cloudflare Tunnels

Show HN: I added Game of Life to my Portfolio Website and it's so cool

Show HN: Sohri – Turn short stories into binge-able audio episodes

Show HN: OffChess – Offline chess puzzles app

Show HN: We developed an AI tool to diagnose car problems

Show HN: An educational Local Qwen3 LLM Inference project written in Rust

Show HN: Cactus – Ollama for Smartphones

Show HN: FlopperZiro – A DIY open-source Flipper Zero clone

Show HN: CXXStateTree – A modern C++ library for hierarchical state machines

Show HN: Interactive pinout for the Raspberry Pi Pico 2