Show HN: Headroom – Reversible context compression for LLMs (~60% cost reduction)

https://github.com/chopratejas/headroom
1•chopratejas•1h ago

Comments

chopratejas•1h ago
Author here. I built Headroom because I was spending $200/day running agents with tool calls.

The problem: tools return huge JSON (search results, DB queries, file listings). Each response bloats context. By turn 10, you're paying for 100k+ tokens on every LLM call.
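To make the growth concrete, here is a rough sketch of how input tokens accumulate when every tool response stays in context. The figure of 10k tokens per tool response and the function name are assumptions for illustration, chosen so that turn 10 lands at the 100k mentioned above.

```python
def context_sizes(turns: int, tokens_per_tool_response: int) -> list[int]:
    """Context size sent on each turn when nothing is ever dropped."""
    return [tokens_per_tool_response * t for t in range(1, turns + 1)]


# Assumed: each tool call adds roughly 10k tokens of JSON to the context.
sizes = context_sizes(turns=10, tokens_per_tool_response=10_000)
final_turn_tokens = sizes[-1]      # context size on turn 10
total_billed_tokens = sum(sizes)   # input tokens paid for across all 10 calls
print(final_turn_tokens, total_billed_tokens)
```

Under these assumptions the total grows quadratically with the number of turns, so trimming each tool response pays off more the longer the agent runs.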

Existing solutions have a fundamental tradeoff:

- Truncation: fast but you might cut data the model needs
- Summarization: slow (~500ms) and still lossy
- Bigger context: just delays the problem, costs more

The insight behind Headroom:

You can't know which data matters until the model tries to use it. So instead of guessing, compress aggressively AND keep a retrieval path.

  1. Smart compression - not random truncation. For JSON arrays, we keep errors (100%), statistical anomalies, items matching the user's query (BM25 + embeddings), first/last items. For code, we use tree-sitter AST parsing to preserve imports, signatures, types - output is guaranteed syntactically valid. For logs, we keep errors and state transitions.

  2. CCR (Compress-Cache-Retrieve) - everything compressed gets cached locally. We inject a `headroom_retrieve` tool. If the model needs more data, it asks and gets it in <1ms.

  The retrieval is what makes aggressive compression safe. In practice, the model almost never retrieves because the smart compression keeps what matters. But when it does need more, it can get it.
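The two steps above can be sketched roughly as follows. This is not Headroom's code: `compress_array`, `retrieve`, the `_cache` dict, and the returned field names are hypothetical, and plain substring matching stands in for the BM25 + embeddings scoring. The statistical-anomaly rule is left out.

```python
import hashlib
import json

# Hypothetical in-memory store; a real cache would bound its size.
_cache: dict[str, list[dict]] = {}


def compress_array(items: list[dict], query: str, keep_edges: int = 2) -> dict:
    """Keep errors, query matches, and first/last items; cache the original."""
    query_terms = set(query.lower().split())
    kept_indices = set()
    for i, item in enumerate(items):
        text = json.dumps(item).lower()
        is_error = "error" in text
        # Substring matching is a stand-in for real relevance scoring.
        matches_query = any(term in text for term in query_terms)
        is_edge = i < keep_edges or i >= len(items) - keep_edges
        if is_error or matches_query or is_edge:
            kept_indices.add(i)

    payload = json.dumps(items, sort_keys=True).encode()
    key = hashlib.sha256(payload).hexdigest()[:12]
    _cache[key] = items  # the full payload stays retrievable
    return {
        "kept": [items[i] for i in sorted(kept_indices)],
        "omitted": len(items) - len(kept_indices),
        "retrieve_key": key,
    }


def retrieve(key: str) -> list[dict]:
    """What a retrieval tool call would return: the uncompressed original."""
    return _cache[key]


rows = [{"id": i, "status": "ok"} for i in range(100)]
rows[40]["status"] = "error: timeout"
rows[70]["note"] = "invoice overdue"
small = compress_array(rows, query="invoice")
print([r["id"] for r in small["kept"]], small["omitted"])
```

The model would see only `kept`, the omitted count, and the key. Passing the key back to something like the `headroom_retrieve` tool restores the full array, which is what makes the compression reversible.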

Results on my workloads:

- Search results (1000 items): 45k → 4.5k tokens (90%)
- Agent with tools (10 calls): 100k → 15k tokens (85%)
- Overhead: 1-5ms per request
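The percentages follow directly from the before/after token counts. A short check, with a hypothetical helper name:

```python
def reduction_percent(before_tokens: int, after_tokens: int) -> float:
    """Share of tokens removed, as a percentage of the original size."""
    return 100 * (before_tokens - after_tokens) / before_tokens


search_reduction = reduction_percent(45_000, 4_500)    # search results
agent_reduction = reduction_percent(100_000, 15_000)   # agent with tools
print(search_reduction, agent_reduction)
```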

Usage:

  As a proxy (zero code changes):
  pip install "headroom-ai[proxy]"
  headroom proxy --port 8787
  ANTHROPIC_BASE_URL=http://localhost:8787 claude
  Or wrap your client:
  from openai import OpenAI
  from headroom import HeadroomClient
  client = HeadroomClient(OpenAI())

LangChain integration is one line.

Limitations I'm aware of:

- CCR adds memory overhead (LRU cache, configurable)
- AST compression requires tree-sitter (~50MB)
- Not battle-tested on all edge cases yet
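For readers unfamiliar with the memory tradeoff in the first limitation, here is a minimal sketch of a bounded LRU cache. The class and its `max_entries` parameter are hypothetical and only illustrate the general technique, not how Headroom implements or configures its cache.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical bounded cache that evicts the least recently used entry."""

    def __init__(self, max_entries: int = 1000):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def put(self, key: str, value: object) -> None:
        self._entries[key] = value
        self._entries.move_to_end(key)          # mark as most recently used
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # drop the oldest entry

    def get(self, key: str):
        if key not in self._entries:
            return None                         # evicted or never stored
        self._entries.move_to_end(key)
        return self._entries[key]


cache = LRUCache(max_entries=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")        # touch "a" so "b" becomes the oldest entry
cache.put("c", 3)     # exceeds the limit, so "b" is evicted
print(cache.get("a"), cache.get("b"), cache.get("c"))
```

In a cache like this, an evicted payload can no longer be retrieved, so the size limit trades memory against how far back retrieval can reach.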

Happy to answer questions about the compression algorithms, the retrieval mechanism, or anything else.

Cryptolog – The NSA's Internal Magazine

https://view.officeapps.live.com/op/view.aspx?src=https%3A%2F%2Fnsarchive%2Egwu%2Eedu%3A443%2Fsit...
1•polalavik•24s ago•0 comments

eBPF Party: Interactive eBPF Playground

https://ebpf.party/
1•signa11•1m ago•0 comments

A$AP Rocky Releases Helicopter Music Video Featuring Gaussian Splatting

https://radiancefields.com/a-ap-rocky-releases-helicopter-music-video-featuring-gaussian-splatting
1•ChrisArchitect•3m ago•0 comments

Meta Lays Off 1,500 People in Metaverse Division

https://www.wsj.com/tech/meta-layoffs-reality-labs-2026-347008b0
1•1vuio0pswjnm7•3m ago•1 comment

Google Gemini Can Proactively Analyze Users' Gmail, Photos, Searches

https://www.bloomberg.com/news/articles/2026-01-14/google-gemini-s-personalized-intelligence-feat...
1•1vuio0pswjnm7•5m ago•1 comment

Microsoft vows to 'pay its way' as it seeks to defuse data centre backlash

https://www.ft.com/content/3f392c9b-c07d-42f5-b000-0a7347ad1ec0
1•1vuio0pswjnm7•8m ago•0 comments

Mobile-first IDE demo for coding without a computer

https://clem1212.github.io/ide/
1•BobbyBrowntwin•11m ago•0 comments

Show HN: Free, maintenance‑free semantic search and related posts for Hexo

https://github.com/SemanticSearch-ai/hexo-plugin
1•oldcai•13m ago•0 comments

Texas Police Invested Millions in a Shadowy Phone-Tracking Software

https://www.texasobserver.org/texas-police-invest-tangles-sheriff-surveillance/
1•helsinkiandrew•17m ago•0 comments

Minecraft Damage Calculator

https://calcforge.net/tools/minecraft-damage-calculator/
1•bitvvip•18m ago•0 comments

Show HN: Skild – The NPM for AI agent skills

https://skild.sh
1•peiiii•18m ago•0 comments

NASA astronauts begin 'bittersweet' medical evacuation from space station

https://www.bbc.co.uk/news/articles/c205r8n0276o
1•lifeisstillgood•19m ago•0 comments

ChatGPT Translate

https://chatgpt.com/translate
2•chenzhekl•19m ago•2 comments

Two Thinking Machines Lab Cofounders Are Leaving to Rejoin OpenAI

https://www.wired.com/story/thinking-machines-lab-cofounders-leave-for-openai/
1•cpeterso•20m ago•0 comments

Just Get a Better Job

https://idiallo.com/blog/just-get-another-job
4•jnord•22m ago•1 comment

Relativity of Generative Aesthetics

https://jimiwen.substack.com/p/relativity-of-generative-aesthetics
1•jimiwen•22m ago•0 comments

US gov't: House sysadmin stole 200 phones, caught by House IT desk

https://arstechnica.com/tech-policy/2026/01/us-govt-house-sysadmin-stole-200-phones-caught-by-hou...
3•jnord•23m ago•1 comment

Stop using MySQL in 2026, it is not true open source

https://optimizedbyotto.com/post/reasons-to-stop-using-mysql/
2•gpi•26m ago•0 comments

Ask HN: Strongman Argument for Trump?

1•bbiab•27m ago•1 comment

Ripped off by Google? Places API

1•texuf•28m ago•0 comments

OpenAI is now selling 6x more codex for 10x the price

https://developers.openai.com/codex/pricing/
1•Szpadel•28m ago•1 comment

Eternal September

https://en.wikipedia.org/wiki/Eternal_September
2•_vaporwave_•29m ago•0 comments

New Model Fails to Explain Near-Death Experiences, Scientists Say

https://news.med.virginia.edu/research/new-model-fails-to-explain-near-death-experiences-scientis...
1•XzetaU8•30m ago•0 comments

Yori, I made a CLI tool that compiles natural language into C++ binaries

https://github.com/alonsovm44/yori
1•alonsovm•36m ago•1 comment

All horses are the same color

https://en.wikipedia.org/wiki/All_horses_are_the_same_color
1•icwtyjj•39m ago•0 comments

Curl: We stop the bug-bounty end of Jan 2026

https://github.com/curl/curl/pull/20312
2•Fiveplus•40m ago•0 comments

Growth / Decline of various mobile app SDKs

https://appgoblin.info/reports/mobile-apps-growth-sdks-2025
1•ddxv•47m ago•1 comment

Grok and the A.I. Porn Problem

https://www.newyorker.com/culture/infinite-scroll/grok-and-the-ai-porn-problem
2•fortran77•48m ago•1 comment

How to Debug Your Life

https://www.joanwestenberg.com/how-to-debug-your-life/
1•zdw•48m ago•0 comments

China's Zhipu Unveils New AI Model Trained on Huawei's Chips

https://www.bloomberg.com/news/articles/2026-01-14/china-s-zhipu-unveils-new-ai-model-trained-on-...
2•antman•50m ago•0 comments