EDIT: I meant to link to the github, not the website: https://github.com/max-hq/max
Like many of us here, I've found myself reaching for the same "pull data into a db; give it to claude" pattern for a while now, whilst doing data spelunking or building tooling - for the same reasons mentioned by thellimist over here [1] and in a few other recent "CLI vs MCP" posts.
To that end, about a month ago I started building a project called `max` - its goal is to cut out the middleman and schematise any data source for you. Essentially, it provides a lingua franca for synchronising and searching data.
In short: Max exposes a CLI for any given data source and mirrors the data locally - i.e. it puts that data right next to the agent. Search is local and fast, and the output is ready for cut, sed, grep, sort, etc.
More concretely:
> max connect @max/connector-gmail --name gmail-1
> max sync gmail-1
> # show me what data i can search for
> max schema @max/connector-gmail
> # do a search
> max search gmail-1 --filter "subject ~= Apples" --fields=subject,from,time
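Because results come back as plain text, they compose with the usual unix tools. A quick illustrative sketch (the filter value and field here are placeholders, not from a real mailbox):

> # count senders matching a filter, most frequent first
> max search gmail-1 --filter "from ~= example.com" --fields=from | sort | uniq -c | sort -rn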
I've built a few connectors over at `max-hq/max-connectors` - but the goal is that they're easy to create (sync is done via a graph walk - max has you provide field resolution so it can figure out how to sync).
In practice, I've found that telling claude to run "max -g llm-bootstrap" to get acquainted, and then asking it to "make a connector for X", works pretty well too :).
There's a lot still to come(!) - realtime, somewhere to host connectors, exposing and serving max nodes... I'll be updating the roadmap over the next couple of days - but I didn't want to wait any longer before sharing here.
(on that note - max is designed for federation. The core is platform agnostic)
In terms of what this approach makes possible - I ran a benchmark on a challenge (it's the one on the website) asking claude to find me names of a particular form in a fairly chunky hubspot (100k contacts). The metrics are roughly what you'd expect from putting the data local and keeping the raw records out of claude's context window:
MCP: 18M tokens | 80m time | $180 cost
Max: 238 tokens | 27s time | $0.003 cost
(I'll explain how these numbers were calculated in a new reply)
It's still early (alpha) but if you're building agents or just want local data, please try it and tell me what breaks.
Thanks!
benvan•5h ago
- Claude (via the Hubspot MCP) was paginating over contacts at 40s and ~150k tokens per 800-contact page (triggering compaction) - the full run was 120 of these loops, at 80 minutes and 18M tokens
- Claude + Max was one `max search hubspot --filter` command piped to `sort | uniq -c`, plus one `max search gdrive` query matching each of the results of the first, also piped to `sort | uniq -c` (roughly the shape sketched below) - the rest of the tokens were spent producing an output from 20 words + 20 numbers
(Both of these calculations ignore cached tokens)
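For a sense of the shape of that run (the field names and filter patterns below are placeholders, not the exact queries):

> # illustrative only - real field names/patterns were different
> max search hubspot --filter "name ~= <pattern>" --fields=name | sort | uniq -c
> max search gdrive --filter "name ~= <match from above>" --fields=name | sort | uniq -c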