Stop Burning Your Context Window – How We Cut MCP Output by 98% in Claude Code

3•mksglu•1h ago

Comments

mksglu•1h ago

Author here. I shared the GitHub repo a few days ago (https://news.ycombinator.com/item?id=47148025) and got great feedback. This is the writeup explaining the architecture.

The core idea: every MCP tool call dumps raw data into your 200K context window. Context Mode spawns isolated subprocesses, and only their stdout enters context. There are no LLM calls; indexing and search are purely algorithmic, using SQLite FTS5 with BM25 ranking and Porter stemming.
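To illustrate the stdout-only constraint, here is a minimal sketch, not the project's actual code. `runIsolated` is a hypothetical helper name, and the 56 KB string is a stand-in for raw tool output. The child process holds the large payload, and the parent only ever sees the short summary the child prints.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical helper: run a script in a child Node process and
// return only what it wrote to stdout.
function runIsolated(script: string): string {
  const result = spawnSync(process.execPath, ["-e", script], {
    encoding: "utf8",
    maxBuffer: 10 * 1024 * 1024,
  });
  if (result.status !== 0) {
    throw new Error(`child exited with ${result.status}: ${result.stderr}`);
  }
  return result.stdout.trim();
}

// The child builds a large payload (a stand-in for raw tool output)
// but prints only a small JSON summary of it.
const script = `
  const raw = "x".repeat(56 * 1024);
  console.log(JSON.stringify({ rawBytes: raw.length, note: "summary only" }));
`;

const summary = runIsolated(script);
console.log(summary);
```

The raw payload never leaves the child; whatever the script chooses to print is the only thing that would count against the context window.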

Since the last post we've seen 228 stars and some real-world usage data. The biggest surprise was how much subagent routing matters: Bash subagents are auto-upgraded to general-purpose so they can use batch_execute instead of flooding context with raw output.

Source: https://github.com/mksglu/claude-context-mode Happy to answer any architecture questions.

jamiecode•1h ago

The 98% reduction is the real story here, but the systemic problem you're solving is even bigger than individual tool calls blowing up context. When you're orchestrating multi-step workflows, each tool output becomes part of the conversation state that carries forward to the next step. A Playwright snapshot at step 1 is 56 KB. It still counts at step 3 when you've moved on to something completely different.

The subprocess isolation is smart; stdout-only is the right constraint. I've been running multi-agent workflows where the cost of tool output accumulation forces bad decisions: summarise outputs manually (defeating the purpose of tool calls), truncate logs (information loss), or cap the workflow depth. None of these is a good option.

The search ranking piece is worth noting. Most people just grep logs or dump chunks and let the LLM sort it out. BM25 + FTS5 means you're pre-filtering at index time, not letting the model do relevance ranking on the full noise. That's the difference between usable and unusable context at scale.
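For readers unfamiliar with BM25, here is a small in-memory sketch of the scoring idea. It is a stand-in written for illustration, not SQLite FTS5 and not the project's code: there is no stemming, the tokenizer is a simple regex, and `bm25Rank`, the sample documents, and the parameter defaults (`k1 = 1.2`, `b = 0.75`) are all assumptions.

```typescript
type Doc = { id: string; text: string };

// Naive tokenizer: lowercase alphanumeric runs, no stemming.
const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Okapi BM25 over a small in-memory corpus. Higher score means more relevant.
function bm25Rank(docs: Doc[], query: string, k1 = 1.2, b = 0.75): { id: string; score: number }[] {
  const tokens = docs.map((d) => tokenize(d.text));
  const avgLen = tokens.reduce((n, t) => n + t.length, 0) / docs.length;
  const terms = Array.from(new Set(tokenize(query)));

  // Document frequency: how many documents contain each query term.
  const df = new Map<string, number>();
  for (const term of terms) {
    df.set(term, tokens.filter((t) => t.indexOf(term) !== -1).length);
  }

  return docs
    .map((d, i) => {
      let score = 0;
      for (const term of terms) {
        const tf = tokens[i].filter((t) => t === term).length;
        if (tf === 0) continue;
        const n = df.get(term) ?? 0;
        const idf = Math.log(1 + (docs.length - n + 0.5) / (n + 0.5));
        const lengthNorm = 1 - b + (b * tokens[i].length) / avgLen;
        score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
      }
      return { id: d.id, score };
    })
    .filter((r) => r.score > 0) // drop chunks that match no query term
    .sort((x, y) => y.score - x.score);
}

const chunks: Doc[] = [
  { id: "a", text: "npm test failed: timeout in auth login handler" },
  { id: "b", text: "build succeeded in 42s" },
  { id: "c", text: "auth token refreshed; login ok; login ok" },
];

const ranked = bm25Rank(chunks, "auth timeout");
console.log(ranked.map((r) => r.id));
```

Chunks that match no query term are dropped before anything reaches the model, which is the pre-filtering effect described above.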

Only question: how does credential passthrough work with MCP's protocol boundaries? If gh/aws/gcloud run in the subprocess, how does the auth state persist between tool calls, or does each call reinit?

mksglu•1h ago

No magic — standard Unix process inheritance. Each execute() spawns a child process via Node's child_process.spawn() with a curated env built by #buildSafeEnv (https://github.com/mksglu/claude-context-mode/blob/main/cont...). It passes through an explicit allowlist of auth vars (GH_TOKEN, AWS_ACCESS_KEY_ID, GOOGLE_APPLICATION_CREDENTIALS, KUBECONFIG, etc.) plus HOME and XDG paths so CLI tools find their config files on disk. No state persists between calls — each subprocess inherits credentials from the MCP server's environment, runs, and exits. This works because tools like gh and aws resolve auth on every invocation anyway (env vars or ~/.config files). The tradeoff is intentional: allowlist over full process.env so the sandbox doesn't leak unrelated vars.
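A rough sketch of the allowlist approach described above, under stated assumptions: `buildSafeEnvSketch` is a hypothetical name, and the list is only the variables named in this comment plus `PATH` and `XDG_CONFIG_HOME`, which are guesses. The real `#buildSafeEnv` in the repo may differ.

```typescript
import { spawnSync } from "node:child_process";

// Assumed allowlist: variables named in the thread, plus PATH and
// XDG_CONFIG_HOME as guesses. The project's actual list may differ.
const ALLOWED_VARS = [
  "PATH",
  "HOME",
  "XDG_CONFIG_HOME",
  "GH_TOKEN",
  "AWS_ACCESS_KEY_ID",
  "GOOGLE_APPLICATION_CREDENTIALS",
  "KUBECONFIG",
];

// Copy only allowlisted variables that are actually set in the source env.
function buildSafeEnvSketch(source: Record<string, string | undefined>): Record<string, string> {
  const env: Record<string, string> = {};
  for (const key of ALLOWED_VARS) {
    const value = source[key];
    if (value !== undefined) env[key] = value;
  }
  return env;
}

// Simulated server environment: one allowlisted credential, one unrelated secret.
const parentEnv = { ...process.env, GH_TOKEN: "example-token", UNRELATED_SECRET: "do-not-leak" };

// Each call spawns a fresh child with the curated env; nothing persists afterwards.
const child = spawnSync(
  process.execPath,
  ["-e", "console.log(JSON.stringify(Object.keys(process.env)))"],
  { env: buildSafeEnvSketch(parentEnv), encoding: "utf8" },
);

const childKeys: string[] = JSON.parse(child.stdout);
console.log(childKeys.indexOf("GH_TOKEN") !== -1, childKeys.indexOf("UNRELATED_SECRET") !== -1);
```

Passing an explicit `env` object to the spawn call replaces the inherited environment rather than extending it, so anything not on the allowlist never reaches the child.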
