Staff Engineer at Ably here. Over the past few months I've been speaking to engineers building AI assistants, copilots, and agentic workflows (over 40 companies at this point), with particular focus on cloud-hosted agents.
I expected the hard problems to be in model selection, prompt engineering, and orchestration. Instead, the same infrastructure challenges kept coming up: realtime sync between agents and end clients is surprisingly painful to get right.
- Managing and scaling WebSocket or SSE connections between agents and clients
- Buffering messages server-side and implementing replay logic for reconnecting clients
- Tracking what each client has received across multiple devices
- Ensuring continuity between historical and live responses on initial load
- Routing reconnecting clients to the correct agent instance in distributed deployments
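The buffering and replay piece in particular gets rebuilt in-house again and again. A minimal sketch of the idea (all names hypothetical, assuming each message carries a monotonically increasing serial per channel):

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    serial: int   # monotonically increasing within the channel
    data: str

@dataclass
class ChannelBuffer:
    """Server-side buffer so reconnecting clients can replay missed messages."""
    messages: list = field(default_factory=list)
    next_serial: int = 0

    def publish(self, data: str) -> Message:
        msg = Message(self.next_serial, data)
        self.next_serial += 1
        self.messages.append(msg)
        return msg

    def replay_after(self, last_seen: int) -> list:
        # On reconnect the client sends the last serial it received;
        # everything after that is replayed in order.
        return [m for m in self.messages if m.serial > last_seen]

buf = ChannelBuffer()
for tok in ["Hello", " ", "world"]:
    buf.publish(tok)

missed = buf.replay_after(0)  # client saw serial 0, then dropped
print([m.data for m in missed])  # → [' ', 'world']
```

Doing this correctly across multiple devices per user, and across distributed agent instances, is where the real pain starts.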
Ably AI Transport solves this: it's a drop-in transport layer that sits between your agents and end-user devices.
We found that many companies were already using Ably Pub/Sub to tackle these problems. The pub/sub pattern decouples agents from clients: agents publish to channels, clients subscribe, Ably handles delivery and replay.
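The decoupling can be pictured with a toy channel (this is the pattern, not the Ably API):

```python
class Channel:
    """Toy pub/sub channel: the agent never holds a reference to any client."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, callback):
        self.subscribers.append(callback)

    def publish(self, message):
        # Delivery is the channel's job, not the agent's.
        for cb in self.subscribers:
            cb(message)

received = []
ch = Channel()
ch.subscribe(received.append)   # client A
ch.subscribe(lambda m: None)    # client B, e.g. a second device
ch.publish("token")             # agent publishes without knowing who listens
print(received)  # → ['token']
```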
AI Transport makes this easier - for example, we've added message appends for efficient token streaming, and annotations for attaching metadata like citations.
A few of the interesting technical pieces:
- Channel-oriented architecture: In connection-oriented setups, the connection pokes the agent into life. If the connection drops, on reconnection the agent must figure out what state each client is missing. AI Transport uses channels instead: agents publish, clients subscribe, the channel handles replay. Presence events let agents detect when users go online/offline or connect from multiple devices.
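The presence side of that bullet can be modelled as a set keyed by user, with one entry per connection. A toy version (names are illustrative, not the SDK's):

```python
class PresenceSet:
    """Tracks which connections each user (clientId) currently has, so an
    agent can tell online/offline and multi-device sessions apart."""
    def __init__(self):
        self.connections = {}  # clientId -> set of connectionIds

    def enter(self, client_id, connection_id):
        self.connections.setdefault(client_id, set()).add(connection_id)

    def leave(self, client_id, connection_id):
        conns = self.connections.get(client_id, set())
        conns.discard(connection_id)
        if not conns:
            # last device gone -> user is offline
            self.connections.pop(client_id, None)

    def is_online(self, client_id):
        return client_id in self.connections

    def device_count(self, client_id):
        return len(self.connections.get(client_id, ()))

p = PresenceSet()
p.enter("alice", "conn-1")
p.enter("alice", "conn-2")      # same user, second device
print(p.device_count("alice"))  # → 2
p.leave("alice", "conn-1")
p.leave("alice", "conn-2")
print(p.is_online("alice"))     # → False
```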
- Identity in decoupled pub/sub: Users authenticate with your server, which issues JWTs with embedded clientId and capabilities. Agents receive messages with cryptographically verified identity. User claims (ably.channel.* in the JWT) appear in message.extras.userClaim - useful for HITL workflows where you verify an approver's role before executing a tool call.
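For the HITL case, the agent-side check reduces to reading the verified claim off the message rather than trusting anything the client sent in the payload. A sketch, with the exact claim layout assumed for illustration:

```python
def handle_approval(message, execute_tool):
    """Gate a tool call on the sender's verified claims.

    `message` mirrors the shape described above: identity (clientId) and
    user claims arrive alongside the message, already verified by the
    transport, so the agent never trusts client-supplied payload fields.
    """
    claim = message.get("extras", {}).get("userClaim", {})
    if claim.get("role") != "approver":
        return "rejected: sender lacks approver role"
    return execute_tool(message["data"]["toolCall"])

msg = {
    "clientId": "alice",
    "data": {"toolCall": {"name": "refund", "amount": 40}},
    "extras": {"userClaim": {"role": "approver"}},
}
result = handle_approval(msg, lambda call: f"executed {call['name']}")
print(result)  # → executed refund
```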
- Token streaming with appends: New message.append operation lets you build a response incrementally. Clients joining mid-stream get a message.update with the complete response so far, then receive subsequent appends. Channel history contains one compacted message per response.
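The append semantics can be sketched in miniature (a toy model of the behaviour described above, not the actual API):

```python
class StreamedResponse:
    """Toy model of append-based token streaming with compaction."""
    def __init__(self):
        self.text = ""
        self.live_subscribers = []

    def subscribe(self, on_event):
        # A client joining mid-stream first gets a snapshot of the
        # response so far, then each subsequent append.
        if self.text:
            on_event(("update", self.text))
        self.live_subscribers.append(on_event)

    def append(self, token):
        self.text += token
        for on_event in self.live_subscribers:
            on_event(("append", token))

    def history(self):
        # History holds one compacted message per response.
        return [self.text]

resp = StreamedResponse()
resp.append("The answer ")
events = []
resp.subscribe(events.append)   # client joins mid-stream
resp.append("is 42.")
print(events)          # → [('update', 'The answer '), ('append', 'is 42.')]
print(resp.history())  # → ['The answer is 42.']
```

The practical upshot is that clients never have to stitch partial token events back together from history themselves.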
- Annotations for citations: Attach citation metadata to responses without modifying content. Clients can subscribe to individual annotations or obtain them on demand via REST. Ably also automatically aggregates annotations into summaries (e.g. count by domain name), which are delivered to clients in realtime.
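The count-by-domain aggregation mentioned above amounts to something like this (a sketch with an assumed annotation shape; Ably computes the summary server-side):

```python
from collections import Counter
from urllib.parse import urlparse

def summarize_citations(annotations):
    """Aggregate citation annotations into a count-by-domain summary."""
    return Counter(urlparse(a["url"]).netloc for a in annotations)

annotations = [
    {"type": "citation", "url": "https://example.com/a"},
    {"type": "citation", "url": "https://example.com/b"},
    {"type": "citation", "url": "https://ably.com/docs"},
]
print(summarize_citations(annotations))
# → Counter({'example.com': 2, 'ably.com': 1})
```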
- Messaging patterns: Docs include patterns for tool calls, human-in-the-loop approval flows, and chain-of-thought streaming.
Docs: https://ably.com/docs/ai-transport
If you're building AI UX, I'd love to hear what problems you've hit and what you've built in-house.
Thanks!
Mike Christensen