
When you run multiple MCP servers, every request to the LLM includes all of their tools and descriptions, which quickly eats into your token limit and increases costs. Most of the time, though, you don't need all of them.

For example, take three popular MCP servers: Notion, GitHub, and Pylance. Together they add about 26K tokens of overhead on every turn. Assuming an average 50-turn coding session and Opus pricing, that overhead costs about $0.93 per session.

`mcp-compress-router` does something very simple: it proxies all your MCP servers behind just two tools, `get_tool_schema` and `invoke_tool`. The `get_tool_schema` description lists the names and arguments of all downstream MCP server tools, so the agent knows what's available. Whenever the agent needs a tool, it first calls `get_tool_schema` to read that tool's full description and argument schema, and then calls `invoke_tool`, which forwards the call to the downstream MCP server.
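To make the two-tool idea concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the `DownstreamTool` and `CompressRouter` classes, the `server.tool` naming scheme, and the `github.create_issue` example tool are all assumptions made up for illustration, and the `handler` callable stands in for a real call to a downstream MCP server.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class DownstreamTool:
    """Hypothetical record for one tool exposed by a downstream MCP server."""
    server: str
    name: str
    description: str
    input_schema: dict[str, Any]
    handler: Callable[..., Any]  # stand-in for the real downstream call


@dataclass
class CompressRouter:
    """Hypothetical router that hides many tools behind two entry points."""
    tools: dict[str, DownstreamTool] = field(default_factory=dict)

    def register(self, tool: DownstreamTool) -> None:
        # Assumed naming scheme: "<server>.<tool>".
        self.tools[f"{tool.server}.{tool.name}"] = tool

    def summary(self, level: str = "high") -> str:
        """Compact listing meant for the get_tool_schema description."""
        lines = []
        for key, tool in sorted(self.tools.items()):
            if level == "max":
                # "max": tool names only.
                lines.append(key)
            else:
                # "high": tool names plus argument names.
                args = ", ".join(tool.input_schema.get("properties", {}))
                lines.append(f"{key}({args})")
        return "\n".join(lines)

    def get_tool_schema(self, tool: str) -> dict[str, Any]:
        """Return the full description and argument schema for one tool."""
        t = self.tools[tool]
        return {"name": tool, "description": t.description, "inputSchema": t.input_schema}

    def invoke_tool(self, tool: str, arguments: dict[str, Any]) -> Any:
        """Forward the call to the downstream handler."""
        return self.tools[tool].handler(**arguments)


router = CompressRouter()
router.register(DownstreamTool(
    server="github",
    name="create_issue",
    description="Create an issue in a repository.",
    input_schema={
        "type": "object",
        "properties": {"repo": {"type": "string"}, "title": {"type": "string"}},
    },
    handler=lambda repo, title: f"created '{title}' in {repo}",
))

print(router.summary("high"))
print(router.get_tool_schema("github.create_issue"))
print(router.invoke_tool("github.create_issue", {"repo": "a/b", "title": "Bug"}))
```

The point of the sketch is the trade-off: the agent only ever sees the short `summary` text up front, and pays for a full schema only when it actually asks for one.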

The savings are pretty serious. With the three servers above, the overhead shrinks from about 26K tokens to about 900 tokens at the "max" compression level (tool names only), or to about 2,000 tokens at the "high" compression level (the default: tool names plus argument names). Either way, you save more than 90% of the overhead.
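The 90%+ figure follows directly from the numbers quoted in this post. A quick check in Python, using only the approximate token counts above and the 50-turn session assumption (the helper name `savings_percent` is just for illustration, and the per-session total ignores any prompt caching):

```python
def savings_percent(baseline_tokens: int, compressed_tokens: int) -> float:
    """Share of per-turn overhead removed, as a percentage."""
    return 100 * (1 - compressed_tokens / baseline_tokens)


BASELINE = 26_000                       # approx. overhead of the three servers per turn
LEVELS = {"max": 900, "high": 2_000}    # approx. overhead after compression
TURNS = 50                              # assumed session length

for level, tokens in LEVELS.items():
    saved_per_session = (BASELINE - tokens) * TURNS
    print(
        f"{level}: {savings_percent(BASELINE, tokens):.1f}% smaller, "
        f"{saved_per_session:,} fewer overhead tokens per session"
    )
```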


Show HN: MCP-compress-router – MCP Compressor