Show HN: ToolMesh – turn all your REST APIs into MCP tools via declarative YAML

2•axeldunkel•1h ago

When the pager goes off at night, I ask Claude: "What is alerting, and what changed in the last hour?" Claude answers by chaining calls across Graylog, Prometheus, Alertmanager, Linode, GitLab, NetBox and more. The menu of tools Claude has access to is even bigger than that: I have connected 30 backends so far (20 in the public registry, the rest internal to my setup), including most of my ops stack (OPNsense, Tailscale, Xen Orchestra, DokuWiki and more). ToolMesh is what makes that menu composable for Claude.

Each backend is a simple DADL file - a small YAML that declares the REST API of the service to ToolMesh, which then exposes those tools to Claude. Most of the publicly available DADLs (currently 20 with 1,833 tools in total) were drafted by an LLM in minutes and tuned from there. The registry is public.

Here is the HN API as DADL - the API behind this very page:

  tools:
    get_top_stories:
      method: GET
      path: /topstories.json
      access: read
      description: "Up to 500 top story IDs, ordered by HN ranking"
    get_item:
      method: GET
      path: /item/{id}.json
      access: read
      description: "Get story, comment, job, poll, or pollopt by ID"
      params:
        id: { type: integer, in: path, required: true }
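
To make the mapping concrete, here is a minimal sketch of how a declaration like the one above could be resolved into an HTTP request. It is an illustration, not ToolMesh's implementation: the `build_request` helper is hypothetical, and the base URL is an assumption (the public HN Firebase endpoint), since the snippet above does not state it.

```python
from urllib.parse import quote

# Assumption: base URL of the public HN Firebase API; the post does not state it.
BASE_URL = "https://hacker-news.firebaseio.com/v0"

# The two declarations from the post, as the dict a YAML parser would produce.
TOOLS = {
    "get_top_stories": {"method": "GET", "path": "/topstories.json", "access": "read"},
    "get_item": {
        "method": "GET",
        "path": "/item/{id}.json",
        "access": "read",
        "params": {"id": {"type": "integer", "in": "path", "required": True}},
    },
}


def build_request(tool_name, args):
    """Hypothetical helper: return (method, url) for one tool call."""
    tool = TOOLS[tool_name]
    path = tool["path"]
    for name, spec in tool.get("params", {}).items():
        if name not in args:
            if spec.get("required"):
                raise ValueError(f"missing required parameter: {name}")
            continue
        value = args[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if spec.get("type") == "integer" and (
            isinstance(value, bool) or not isinstance(value, int)
        ):
            raise TypeError(f"parameter {name} must be an integer")
        if spec.get("in") == "path":
            # Substitute the placeholder, percent-encoding the value.
            path = path.replace("{" + name + "}", quote(str(value), safe=""))
    return tool["method"], BASE_URL + path


method, url = build_request("get_item", {"id": 8863})
print(method, url)
```

Nothing here touches the network; the point is only that method, path and typed parameters are enough to derive the call.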

How can a single agent access so many backends without overflowing its context? Code Mode. Naively, every tool and schema goes into context - 50,000+ tokens before the agent does anything useful. ToolMesh compresses that to ~1,000 by giving the model a typed API surface and letting it ask for endpoint details only when it needs them. That is the difference between "doesn't scale" and "please add 10 more, it's fine!". ToolMesh can also connect to other MCP servers, making them Code Mode capable as well.

Security is built in: credentials never reach the model (they are injected at runtime). ToolMesh runs a fail-closed pipeline: auth -> authz -> credential injection -> exec -> output gate -> audit. CallerClass lets the same API have different policy per client type (local dev assistant vs hosted agent vs CI bot). Every call lands in a SQLite-queryable audit log - "what did the agent do Tuesday?" becomes a SQL query, not a shrug.
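
The shape of such a fail-closed pipeline can be sketched in a few lines. This is an assumption-heavy illustration, not ToolMesh's code: the policy table, the caller class names, the secret store, the audit schema and the `run_tool` function are all hypothetical. The output gate here simply redacts the injected secret, and denied calls are audited too; the real gate and audit behaviour may differ.

```python
import sqlite3

# Hypothetical policy: access classes each caller class may use.
POLICY = {"local_dev": {"read", "write"}, "hosted_agent": {"read"}}
# Hypothetical secret store; these values are never shown to the model.
SECRETS = {"hn": "token-123"}

# Hypothetical audit schema, kept in memory for the example.
audit = sqlite3.connect(":memory:")
audit.execute("CREATE TABLE calls (caller TEXT, tool TEXT, outcome TEXT)")


def run_tool(caller_class, authenticated, tool, access, backend):
    """Run one call through the stages; any failure denies the call."""
    outcome, result = "denied", None
    try:
        if not authenticated:                              # auth
            raise PermissionError("unauthenticated")
        if access not in POLICY.get(caller_class, set()):  # authz
            raise PermissionError("access class not allowed")
        secret = SECRETS[backend]                          # credential injection
        raw = f"called {tool} with {secret}"               # exec (stand-in for the HTTP call)
        result = raw.replace(secret, "[redacted]")         # output gate
        outcome = "ok"
    except Exception:
        # Fail closed: an error in any stage means no result is returned.
        result = None
    finally:
        audit.execute(                                     # audit
            "INSERT INTO calls VALUES (?, ?, ?)", (caller_class, tool, outcome)
        )
    return outcome, result


print(run_tool("hosted_agent", True, "get_item", "read", "hn"))
print(run_tool("hosted_agent", True, "delete_item", "write", "hn"))
print(audit.execute("SELECT caller, tool, outcome FROM calls").fetchall())
```

An unknown caller class gets an empty permission set, so it is denied by default rather than allowed by omission.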

ToolMesh is not magic. APIs with stateful flows or weird auth still need care, and an LLM with a great tool surface can still pick the wrong tool. You still need sane policy.

Try before cloning: https://demo.toolmesh.io is a public instance with the HN API loaded (login dadl/toolmesh). Connect Claude Desktop, Claude Code, or ChatGPT in 30 seconds: https://toolmesh.io/demo

GitHub: https://github.com/DunkelCloud/ToolMesh
Docs: https://toolmesh.io
DADL Spec + Registry: https://dadl.ai

Apache 2.0, single Go binary or Docker, no SaaS dependency.

If you think of your full ops stack - what DADLs would you like to have available to your LLM?

Comments

axeldunkel•1h ago

A bit more on DADL, since this is what people typically ask first - why ANOTHER standard?

DADL is deliberately narrower than, say, OpenAPI. It describes only the tool surface that an agent is allowed to call - not the full API contract that humans, SDK generators, gateways, docs and mocks need. In practice this means fewer parts to think about: method, path, parameters, access class, descriptions, and policy metadata. The point is to make the allowed actions explicit and small enough to actually review.

Every MCP project I have seen wraps APIs imperatively - custom code per backend. DADL is the only declarative format I know of that makes the allowed surface reviewable in a PR diff. That is a deliberate trade-off: less flexibility, more auditability.

Why YAML? Because humans actually read and edit it. I wanted something small enough to review in a PR, that diffs cleanly, and that is easy for LLMs to generate AND for people to write by hand when needed. In practice this matters more than maximum expressiveness.

What DADL can do: describe HTTP tools with typed parameters, declare auth requirements, attach policy metadata and caller constraints, provide a compact tool surface to the model - and attach a simple 'access' badge to each tool that flags it as read-only or dangerous. And errors come back in a form the LLM can reason about, not as crashes that break the flow.
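
As a rough illustration of the last point, a tool call can be wrapped so that a failure becomes data the model can read rather than an exception that aborts the flow. The envelope fields (`ok`, `data`, `error`, `message`) and the `call_tool` name are hypothetical; the actual error format is not shown in this thread.

```python
def call_tool(func, *args):
    """Hypothetical wrapper: return a result envelope instead of raising."""
    try:
        return {"ok": True, "data": func(*args)}
    except Exception as exc:
        # Report the failure as structured data the model can act on.
        return {"ok": False, "error": type(exc).__name__, "message": str(exc)}


# Stand-in "tools": int() succeeds on "42" and fails on "forty-two".
print(call_tool(int, "42"))
print(call_tool(int, "forty-two"))
```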

What DADL is not trying to do: replace OpenAPI, capture every edge case of complex APIs, or be a full SDK generation format.

A few questions I get often:

Does this work for every API? No. APIs with very stateful flows, weird auth handshakes, streaming edge cases or messy responses still need custom handling. Some APIs map cleanly to DADL, some do not - but for those that do not, you can still plug in an existing MCP server through ToolMesh, and Code Mode applies to it too.

Why not generate from OpenAPI? You can: OpenAPI is great source material. Point an LLM at the DADL specification and the OpenAPI definition, and you get a valid DADL that usually only needs tuning.

So far there are 20 DADLs in the public registry covering 1,833 tools (GitHub, Cloudflare, GitLab, DeepL, Hetzner Cloud and more). If there is a specific API you would want to see as DADL, just ask - I am happy to add it.

If you want to try before cloning: https://demo.toolmesh.io is a public instance with the HN API loaded (login dadl/toolmesh). Works with Claude.ai, Claude Desktop, Claude Code, and ChatGPT - setup takes 30 seconds: https://toolmesh.io/demo

axeldunkel•1h ago

Good question on Code Mode internals.

In Code Mode the model sees only two tools by default: list_tools(pattern) and execute_code(code). list_tools takes a regex and returns TypeScript signatures for matching tools. execute_code runs JavaScript that calls them.

So when the model actually needs the GitHub API for example, it calls list_tools("github.*pull") - it gets back just the typed signatures for those endpoints, and then writes code against them. Your second hypothesis is the mechanism: a meta-tool that queries on demand. The typed signatures (first hypothesis) are what the model reasons over once it has them.
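
The lookup step can be sketched as a regex filter over a signature table. The tool names and signatures below are invented for the example and are not taken from the registry, and matching with `re.search` on the tool name is an assumption about how the pattern is applied.

```python
import re

# Hypothetical tool names and TypeScript-style signatures (not from the registry).
SIGNATURES = {
    "github_list_pull_requests":
        "github_list_pull_requests(owner: string, repo: string): Promise<PullRequest[]>",
    "github_get_pull_request":
        "github_get_pull_request(owner: string, repo: string, number: number): Promise<PullRequest>",
    "github_list_issues":
        "github_list_issues(owner: string, repo: string): Promise<Issue[]>",
    "gitlab_list_merge_requests":
        "gitlab_list_merge_requests(projectId: number): Promise<MergeRequest[]>",
}


def list_tools(pattern):
    """Return the signatures of tools whose name matches the regex."""
    regex = re.compile(pattern)
    return [sig for name, sig in sorted(SIGNATURES.items()) if regex.search(name)]


# Only the two pull-request signatures enter the context, not the whole table.
for signature in list_tools("github.*pull"):
    print(signature)
```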

That is what really brings the cost down. A large API as MCP tool definitions is easily 40-50k tokens upfront. The same API via list_tools + execute_code is ~1k for the two tool descriptions, plus only the signatures the model pulls per query.
