frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

Open in hackernews

Show HN: ArchGW – an intelligent edge and service proxy for agents

33•honorable_coder•21h ago
Hey HN!

This is Adil, Salman and Jose and and we’re behind archgw [1]. An intelligent proxy server designed as an edge and AI gateway for agents - one that natively know how to handle prompts, not just network traffic. We’ve made several sweeping changes so sharing the project again.

A bit of background on why we’ve built this project. Building AI agent demos is easy, but to create something production-ready there is a lot of repeat low-level plumbing work that everyone is doing. You’re applying guardrails to make sure unsafe or off-topic requests don’t get through. You’re clarifying vague input so agents don’t make mistakes. You’re routing prompts to the right expert agent based on context or task type. You’re writing integration code to quickly and safely add support for new LLMs. And every time a new framework hits the market or is updated, you’re validating or re-implementing that same logic—again and again.

Putting all the low-level plumbing code in a framework gets messy to manage, harder to update and scale. Low-level work isn't business logic. That’s why we built archgw - an intelligent proxy server that handles prompts during ingress and egress and offers several related capabilities from a single software service. It lives outside your app runtime, so you can keep your business logic clean and focus on what matters. Think of it like a service mesh, but for AI agents.

Prior to building archgw, the team spent time building Envoy [2] at Lyft, API Gateway at AWS, specialized NLP models at Microsoft Research and worked on safety at Meta. archgw was born out of the belief that rule-based, single-purpose tools that handle the work around resiliency, processing and routing prompts should move into a dedicated infrastructure layer for agents, but built on the battle-tested foundational of Envoy Proxy.

The intelligence in archgw comes from our fast Task-specific LLMs [3] that can handle things like agent routing and hand off, guardrails and preference-based intelligent LLM calling. Here are some additional details about the open source project. archgw is written in rust, and the request path has three main parts:

* Listener subsystem which handles downstream (ingress) and upstream (egress) request processing. * Prompt handler subsystem. This is where archgw makes decisions on the safety of the incoming request via its prompt_guard hooks and identifies where to forward the conversation to via its prompt_target primitive. * Model serving subsystem is the interface that hosts all the lightweight LLMs engineered in archgw and offers a framework for things like hallucination detection of our these models

We loved building this open source project, and our belief is that this infra primitive would help developers build faster, safer and more personalized agents without all the manual prompt engineering and systems integration work needed to get there. We hope to invite other developers to use and improve Arch. Please give it a shot and leave feedback here, or at our discord channel [4] Also here is a quick demo of the project in action [5]. You can check out our public docs here at [6]. Our models are also available here [7].

[1] https://github.com/katanemo/archgw [2] https://www.envoyproxy.io/ [3] https://huggingface.co/collections/katanemo/arch-function-66... [4] https://discord.com/channels/1292630766827737088/12926307682... [5] https://www.youtube.com/watch?v=I4Lbhr-NNXk [6] https://docs.archgw.com/ [7] https://huggingface.co/katanemo

Comments

mutant•19h ago
Huh, this is pretty dope. I tried this example https://github.com/katanemo/archgw/blob/main/demos/samples_p...

And was pleased with what I was able to do. Thanks

sparacha•19h ago
That’s an example of what the edge component could do. Did you give the preference-based automatic routing a try?
mutant•17h ago
No, but I've already put this at the top of my tinker pile. I'm sure I will soon
isuckatcoding•5h ago
I’m still new to this ecosystem but is this something you’d use together with langchain or does it replace some use cases there?
honorable_coder•4h ago
What’s missing right now are our guides showing how well ArchGW integrates with existing frameworks and tools. But the core idea is simple: it offloads low-level responsibilities—like routing, safety, and observability—that frameworks like LangChain currently try to handle inside the app. That means less bloat and more clarity in your agent logic.

And importantly, some things just can’t be done well in a framework. For example, enforcing global rate limits across LLMs isn’t realistic when each agent instance holds its own local state. That kind of cross-cutting concern needs to live in infrastructure—not in application code.

jufter•4h ago
Was going to ask how this integrates into Envoy but dug into the code it looks like proxywasm which must mean `envoy.bootstrap.wasm` ?

How does a screen work?

https://www.makingsoftware.com/chapters/how-a-screen-works
228•chkhd•7h ago•59 comments

Show HN: A Raycast-compatible launcher for Linux

https://github.com/ByteAtATime/raycast-linux
103•ByteAtATime•4h ago•24 comments

A technical look at Iran's internet shutdowns

https://zola.ink/blog/posts/a-technical-look-at-irans-internet-shutdown
54•znano•4h ago•21 comments

Five companies now control over 90% of the restaurant food delivery market

https://marketsaintefficient.substack.com/p/five-companies-now-control-over-90
28•goinggetthem•49m ago•10 comments

Reading Neuromancer for the first time in 2025

https://mbh4h.substack.com/p/neuromancer-2025-review-william-gibson
321•keiferski•13h ago•283 comments

The Gottorf Globe and its reconstruction

https://gottorfer-globus.de/en/the-gottorf-globe
8•Archelaos•1h ago•2 comments

Does showing seconds in the system tray actually use more power?

https://www.lttlabs.com/blog/2025/07/11/does-showing-seconds-in-the-system-tray-actually-use-more-power
97•LorenDB•3h ago•85 comments

GLP-1s Are Breaking Life Insurance

https://www.glp1digest.com/p/how-glp-1s-are-breaking-life-insurance
151•alexslobodnik•2h ago•187 comments

The North Korean fake IT worker problem is ubiquitous

https://www.theregister.com/2025/07/13/fake_it_worker_problem/
92•rntn•9h ago•76 comments

Show HN: Learn LLMs LeetCode Style

https://github.com/Exorust/TorchLeet
90•Exorust•8h ago•10 comments

C3 solved memory lifetimes with scopes

https://c3-lang.org/blog/forget-borrow-checkers-c3-solved-memory-lifetimes-with-scopes/
65•lerno•2d ago•57 comments

Axon's Draft One AI Police Report Generator Is Designed to Defy Transparency

https://www.eff.org/deeplinks/2025/07/axons-draft-one-designed-defy-transparency
180•zdw•2d ago•117 comments

Infisical (YC W23) Is Hiring DevRel Engineers

https://www.ycombinator.com/companies/infisical/jobs/qCrLiJb-developer-relations
1•vmatsiiako•4h ago

How to scale RL to 10^26 FLOPs

https://blog.jxmo.io/p/how-to-scale-rl-to-1026-flops
16•jxmorris12•3d ago•0 comments

Fine dining restaurants researching guests to make their dinner unforgettable

https://www.sfgate.com/food/article/data-deep-dives-bay-area-fine-dining-restaurants-20404434.php
22•borski•5h ago•60 comments

Hungary's oldest library fighting to save 100k books from a beetle infestation

https://www.nbcnews.com/world/hungary/hungary-pannonhalma-archabbey-beetle-infestation-rcna218539
49•rntn•2h ago•18 comments

The upcoming GPT-3 moment for RL

https://www.mechanize.work/blog/the-upcoming-gpt-3-moment-for-rl/
153•jxmorris12•4d ago•59 comments

Holographic ribbon aims to oust magnetic tape with 50-year life span and 200TB

https://www.tomshardware.com/pc-components/storage/holographic-ribbon-aims-to-oust-magnetic-tape-with-50-year-life-span-and-200tb-capacity-per-cartridge-holomem-says-optical-ribbon-based-carts-work-with-some-components-of-existing-systems-reducing-fricition
12•freddier•1h ago•5 comments

The Robot Sculptors of Italy

https://www.bloomberg.com/features/2025-robot-sculptors-marble/
40•helsinkiandrew•3d ago•7 comments

Most people who buy games on Steam never play them

https://howtomarketagame.com/2025/06/03/most-people-who-buy-your-game-wont-play-it/
149•3Samourai•3h ago•136 comments

Local Chatbot RAG with FreeBSD Knowledge

https://hackacad.net/post/2025-07-12-local-chatbot-rag-with-freebsd-knowledge/
46•todsacerdoti•7h ago•3 comments

Notes on Graham's ANSI Common Lisp (2024)

https://courses.cs.northwestern.edu/325/readings/graham/graham-notes.html
80•oumua_don17•3d ago•28 comments

The Decipherment of the Dhofari Script

https://www.science.org/content/article/mysterious-pre-islamic-script-oman-finally-deciphered
51•pseudolus•10h ago•17 comments

Monitoring My Homelab, Simply

https://b.tuxes.uk/simple-homelab-monitoring.html
70•Bogdanp•3d ago•26 comments

Understanding Tool Calling in LLMs – Step-by-Step with REST and Spring AI

https://muthuishere.medium.com/understanding-tool-function-calling-in-llms-step-by-step-examples-in-rest-and-spring-ai-2149ecd6b18b
69•muthuishere•11h ago•20 comments

Bypassing Google's big anti-adblock update

https://0x44.xyz/blog/web-request-blocking/
934•deryilz•1d ago•803 comments

Are a few people ruining the internet for the rest of us?

https://www.theguardian.com/books/2025/jul/13/are-a-few-people-ruining-the-internet-for-the-rest-of-us
4•pseudolus•28m ago•3 comments

Edward Burtynsky's monumental chronicle of the human impact on the planet

https://www.newyorker.com/culture/photo-booth/earths-poet-of-scale
104•pseudolus•18h ago•16 comments

Lua beats MicroPython for embedded devs

https://www.embedded.com/why-lua-beats-micropython-for-serious-embedded-devs
62•willhschmid•12h ago•55 comments

Gaming cancer: How citizen science games could help cure disease

https://thereader.mitpress.mit.edu/how-citizen-science-games-could-help-cure-disease/
97•pseudolus•10h ago•41 comments