frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

Open in hackernews

A all CLIs tokens and context reducer by 97%

https://www.squeezr.es/
1•sergioramosv•1h ago

Comments

sergioramosv•1h ago
I've been using Claude Code and Codex daily for months. They're some of the best programming tools I've tried. But there's something nobody tells you when you start: context runs out fast, and the cost grows exponentially.

The real problem isn't the message you're sending When you're 50 messages into a session and you send message 51, your CLI doesn't just send that message. It sends all 51. The entire conversation, from the beginning, with every single request.

On top of that, Claude Code's system prompt is 13,000 characters — also sent with every message. Every command result the AI has run, every file it read, every search it performed — all of it is in the history, resent again and again.

Why existing tools don't fix this There's a very popular tool for this problem: RTK (Rust Token Killer), with over 16,000 GitHub stars. It does exactly what it promises: it works as a shell wrapper that intercepts the stdout of each command before it enters the context. When the AI runs git diff, RTK filters the output before the result is stored in the history.

Once a command result has entered the history, RTK can't touch it anymore. And on message 51, those 50 previous messages — with all their results, logs, file reads — are resent in full to the API. RTK has no visibility into the accumulated history.

In numbers: in a 50-turn session with 150,000 total tokens, RTK saves approximately 1.6%. It can only act on the current turn.

What I built Squeezr is a local HTTP proxy that intercepts each request before it reaches the API. It operates at a different level than RTK: not on the stdout of a single command, but on the complete HTTP request — it sees and compresses the entire conversation on every send.

The system prompt is compressed once and cached. From 13,000 chars down to ~650. On the next request, and the one after, it comes straight from cache — no recompression.

Command and tool results are filtered before they accumulate in the history. When the AI runs npm test and gets 200 lines back, Squeezr extracts only the failing tests. When it reads a file, it keeps what's relevant. When it searches, it compacts the results. Git commands, Docker, kubectl, compilers, linters — each has its own specific pattern. And unlike RTK, Squeezr also compresses file reads and search results, not just bash output.

The full history is compressed with every request. Older messages are summarized automatically. Message 51 doesn't resend 50 full conversations — it resends 48 compressed ones and the last 3 intact.

The result on that same 85,000 char example: 25,000 chars. 71% less, on every message. In long sessions, cumulative savings reach 89%.

No quality loss Compression is lossless. All original content is stored locally. If the AI needs more detail from something that was compressed, it calls squeezr_expand() and gets the full original back instantly — no cost, no API call.

The AI gets the same information. Without the filler.

AI compression uses the cheapest model you already have — no extra cost When a block is too long for deterministic patterns, Squeezr uses an AI model to summarize it — always the cheapest one from the provider you're already using: Haiku if you're on Claude, GPT-4o-mini if you're on Codex, Flash if you're on Gemini. And if you work with local models through Ollama or LM Studio, it uses local models too. No extra API keys, no additional cost.

What changed in practice Sessions last much longer. The AI keeps track because the context isn't filled with noise. And token spending dropped considerably:

Works today with Claude Code, Codex, Aider, and Gemini CLI. Cursor support is coming soon.

MIT. https://squeezr.es

If you try it, squeezr gain will tell you exactly how much you're saving.

LLM inference load balancer optimized for AMD Radeon VII GPUs

https://github.com/janit/viiwork
1•velmu•2m ago•0 comments

Show HN: I built a tool to show how much ARR you lose to FX fees

https://fixmyfx.com
1•TaniaBell_PD•7m ago•1 comments

3 New world class MAI models, available in Foundry

https://microsoft.ai/news/today-were-announcing-3-new-world-class-mai-models-available-in-foundry/
2•geox•8m ago•0 comments

Get alerts of stolen bikes in your area – Register your bike in case of theft

https://bikewatch.app
1•fullstacking•11m ago•1 comments

The Health and Healthcare Spending Effects of GLP-1s

https://www.nber.org/digest/202604/health-and-healthcare-spending-effects-glp-1s
1•neehao•13m ago•0 comments

Steam to Show Estimated FPS

https://www.tomshardware.com/video-games/pc-gaming/steam-starts-gathering-fps-data-with-latest-cl...
1•ortusdux•15m ago•0 comments

Gstack for Learning Chinese

https://github.com/geometer-jones/the-big-learn
1•geometerJones•15m ago•1 comments

KDE is getting support for the xx-fractional-scale-v2 Wayland protocol

https://www.neowin.net/news/kde-is-getting-support-for-the-xx-fractional-scale-v2-wayland-protocol/
1•bundie•15m ago•0 comments

Onepilot – Deploy AI coding agents to remote servers from your iPhone

https://onepilotapp.com
8•elmlabs•23m ago•4 comments

Tandem: An IDE for non-code docs for real-time collaboration with Claude Code

https://github.com/bloknayrb/tandem
2•bloknayrb•24m ago•1 comments

Show HN: A Dad Joke Website

https://joshkurz.net/
2•joshkurz•24m ago•0 comments

Everything I hate about the Mac

https://blog.d11r.eu/mac/
4•dominicq•25m ago•2 comments

No Agenda, No Meeting

https://noagendanomeeting.net
2•benbalter•25m ago•1 comments

Show HN: Real-time AI (audio/video in, voice out) on an M3 Pro with Gemma E2B

https://github.com/fikrikarim/parlor
2•karimf•26m ago•0 comments

Moon Mission Orbit Animations

https://sankara.net/astro/lunar-missions/mission.html?mission=artemis2
1•jaypatelani•26m ago•0 comments

New Programming Language – Codescript

https://github.com/GHisaque/Codescript/releases/tag/v1.0.0
1•IsaqueCrystal•26m ago•2 comments

Prysma: Anatomy of an LLVM Compiler Built from Scratch in 8 Weeks

https://old.reddit.com/r/Compilers/comments/1sccdmi/prysma_anatomy_of_an_llvm_compiler_built_from/
1•zyphorah•27m ago•1 comments

AI Is Rewiring World's Most Prolific Film Industry

https://www.reuters.com/technology/ai-is-rewiring-worlds-most-prolific-film-industry-2026-04-04/
1•rcarr•27m ago•0 comments

Callvent – I built an app that turns phone calls into calendar events

https://callvent.app/en/blog/building-callvent/
1•robertmittl•29m ago•0 comments

Ask HN: LLM-Based Spam Filter

1•michidk•36m ago•0 comments

Show HN: Built a model-agnostic, desktop-native, research studio for local files

https://old.reddit.com/r/LLMDevs/comments/1sbusn8/new_pdfviewer_notes_panel_search_downloader_tool/
1•ieuanking•38m ago•0 comments

Josefina Aguilar, maestra artesana del barro, murió a los 80 añOS

https://www.nytimes.com/es/2026/04/02/espanol/cultura/josefina-aguilar-artesana.html
1•paulpauper•43m ago•0 comments

The CA Minimum Wage Increase: Summing Up

https://marginalrevolution.com/marginalrevolution/2026/04/the-ca-minimum-wage-increase-summing-up...
2•paulpauper•43m ago•0 comments

What if everything still ran on vacuum tubes? [video]

https://www.youtube.com/watch?v=mEpnRM97ACQ
2•marklit•44m ago•1 comments

Smartphones, Online Music Streaming, and Traffic Fatalities

https://www.nber.org/papers/w34866
1•naves•45m ago•0 comments

Claude Code skill to preserve traditional Unix style conventions

https://github.com/agiacalone/unix-conventions
2•agiacalone•46m ago•1 comments

How Close Is Too Close? Applying Fluid Dynamics Research Methods to PC Cooling

https://www.lttlabs.com/articles/2026/04/04/how-close-is-too-close-applying-fundamental-fluid-dyn...
1•LabsLucas•46m ago•1 comments

DIY Air Drums

https://www.instructables.com/SpaceDrums-Play-Drums-in-the-Air/
2•nlarion•49m ago•0 comments

Marc Andreessen on why "this time is different" in AI

https://www.latent.space/p/pmarca
3•theorchid•51m ago•0 comments

Microsoft Hasn't Had a Coherent GUI Strategy Since Petzold

https://www.jsnover.com/blog/2026/03/13/microsoft-hasnt-had-a-coherent-gui-strategy-since-petzold/
8•naves•51m ago•1 comments