
Show HN: TokenShield – Local proxy that dedupes Claude Code conversation traffic

https://www.npmjs.com/package/@curatedmcp/tokenshield
1•curatedmcp•34m ago
I built a local proxy that dedupes Claude Code traffic. TokenShield cuts your Claude Code bill by 40-70%.

Comments

curatedmcp•33m ago
I run Claude Code most days and my Anthropic bill kept creeping up without me understanding which conversations were the expensive ones. A 25-turn agentic session re-reads `auth.ts` five times and re-runs `gh pr list` three times — every duplicate ships as a fresh tool_result content block to the model, every time. The model already saw identical bytes two turns ago, but it doesn't matter; you pay for them again.

TokenShield is a small Node 22 proxy that sits in front of api.anthropic.com. Set `ANTHROPIC_BASE_URL=http://127.0.0.1:7777` and Claude Code (or Cursor, Windsurf, Zed, Aider — anything that uses the Anthropic SDK) routes through it. The proxy hashes every tool_result content block per-conversation; subsequent occurrences of the same hash within the same conversation are replaced with a deterministic stub:

  { "type": "tool_result",
    "tool_use_id": "...",
    "content": "[tokenshield: identical content seen at message 2, sha:1f6063fe]" }

The deterministic part matters: the same input always produces the same stub, so Anthropic's prompt caching (`cache_control`) stays valid.
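
For readers who want to see the shape of the idea, here is a minimal sketch of per-conversation tool_result dedup. It is not TokenShield's actual code. The `Block` and `Message` types, the function name `dedupeToolResults`, the choice of SHA-256, and the zero-based message index in the stub are all assumptions made for illustration; only the stub wording and the 256-byte threshold come from the post.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified request shapes (only the fields this sketch needs).
interface Block {
  type: string;
  tool_use_id?: string;
  content?: unknown;
}
interface Message {
  role: string;
  content: string | Block[];
}

const MIN_BYTES = 256; // the post says payloads under 256 bytes are left alone

// Replace repeated tool_result payloads within one conversation with a stub.
// The function is pure: the same message list always yields the same output.
function dedupeToolResults(messages: Message[]): Message[] {
  const seen = new Map<string, number>(); // content hash -> index of first message
  return messages.map((msg, index) => {
    if (typeof msg.content === "string") return msg;
    const content = msg.content.map((block) => {
      if (block.type !== "tool_result" || typeof block.content !== "string") return block;
      if (Buffer.byteLength(block.content, "utf8") < MIN_BYTES) return block;
      // Assumption: SHA-256, shortened to 8 hex characters in the stub text.
      const hash = createHash("sha256").update(block.content).digest("hex");
      const first = seen.get(hash);
      if (first === undefined) {
        seen.set(hash, index); // first occurrence passes through untouched
        return block;
      }
      return {
        ...block,
        content: `[tokenshield: identical content seen at message ${first}, sha:${hash.slice(0, 8)}]`,
      };
    });
    return { ...msg, content };
  });
}
```

Because every request carries the whole conversation history, a function like this needs no state between requests, which is one way to get the determinism described above.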

The accounting side was the bug-hunt I didn't expect. First version captured 0 tokens on successful responses. Turned out Anthropic returns Content-Encoding: gzip and I was running JSON.parse on the compressed buffer. The silent try/catch swallowed the error and I shipped "200 status, $0.00 spent" for a week before noticing. Fix: `accept-encoding: identity` on upstream — bandwidth on localhost doesn't matter and reliable measurement does.
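
A rough sketch of the failure mode and a defensive alternative, in case it helps anyone building something similar. The helper name `readUsage` and the `Usage` type are hypothetical; the sketch assumes a non-streaming Messages API response whose JSON body carries a `usage` object with `input_tokens` and `output_tokens`. Instead of swallowing the parse error, it decodes by `Content-Encoding` and throws when it cannot read the usage.

```typescript
import { gunzipSync, gzipSync } from "node:zlib";

// Hypothetical type: just the two counters this sketch reads.
interface Usage {
  input_tokens: number;
  output_tokens: number;
}

// Decode the upstream body according to its Content-Encoding, then parse it.
// Errors propagate to the caller so a bad parse cannot turn into "$0.00 spent".
function readUsage(body: Buffer, contentEncoding: string | undefined): Usage {
  const encoding = (contentEncoding ?? "identity").toLowerCase();
  let raw: Buffer;
  if (encoding === "gzip") {
    raw = gunzipSync(body);
  } else if (encoding === "identity") {
    raw = body;
  } else {
    throw new Error(`unsupported content-encoding: ${encoding}`);
  }
  const parsed = JSON.parse(raw.toString("utf8"));
  if (typeof parsed?.usage?.input_tokens !== "number") {
    throw new Error("response has no usage block");
  }
  return parsed.usage as Usage;
}

// Usage: a gzip-compressed body parses once the header is honored.
const sample = gzipSync(JSON.stringify({ usage: { input_tokens: 12, output_tokens: 3 } }));
console.log(readUsage(sample, "gzip"));
```

Sending `accept-encoding: identity` upstream, as in the post, avoids the decode step entirely; handling the header explicitly is a second line of defense if a compressed response arrives anyway.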

Real numbers from my own usage yesterday: one Opus-4 turn paid $0.1719, saved $0.0747 — 30.3% on that turn. Bench numbers across three recorded fixtures (light Q&A, medium coding, heavy 25-turn agentic): 0%, 27.7%, 62.1%. The light case correctly does nothing — there's nothing to dedup in a 5-turn conversation with no tool use.
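
For anyone checking the arithmetic: the 30.3% only works out if the saving is measured against what the turn would have cost without dedup (paid plus saved), not against the amount paid. A small sketch, with `savedFraction` as a hypothetical helper name:

```typescript
// Fraction of the would-have-been cost that dedup removed.
// Assumption: the baseline is paid + saved, which is what reproduces the
// 30.3% figure quoted in the post.
function savedFraction(paidUsd: number, savedUsd: number): number {
  return savedUsd / (paidUsd + savedUsd);
}

const percent = (savedFraction(0.1719, 0.0747) * 100).toFixed(1);
console.log(`${percent}%`); // 0.0747 / 0.2466, prints "30.3%"
```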

What's deliberately not here:

- It only helps API-billed clients. Claude Desktop and claude.ai web are flat-subscription / hardcoded endpoint, so there is no metered bill to compress.
- OpenAI / Gemini are on a waitlist. The provider adapter interface is in the code, but I'd rather ship one provider correctly than three half-baked.
- Compression is conservative (>= 256 byte payloads, content-hash keyed). It does not touch streaming SSE bytes; passthrough is byte-faithful so the prompt cache stays valid.

The piece I'd most appreciate feedback on: I'm staring at `response-cache` as the next processor (LRU+TTL on (model, system_prompt_hash, last_user_msg_hash)), but the correctness story scares me — `temperature > 0` and tool-using conversations both make caching risky. Would love thoughts from anyone who has done semantic caching on LLM responses about where the landmines are.
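
To make the question concrete, here is one possible sketch of the `response-cache` idea: an LRU+TTL map keyed on (model, system_prompt_hash, last_user_msg_hash) that refuses to cache the two risky cases named above. Everything here is hypothetical: the `CacheableRequest` shape (string-only message content), `cacheKey`, and `ResponseCache` are illustration names, and treating an omitted temperature as non-zero is an assumption.

```typescript
import { createHash } from "node:crypto";

// Hypothetical, simplified request shape.
interface CacheableRequest {
  model: string;
  system?: string;
  temperature?: number;
  tools?: unknown[];
  messages: { role: string; content: string }[];
}

const sha = (s: string): string => createHash("sha256").update(s).digest("hex");

// Returns null when the request should bypass the cache.
function cacheKey(req: CacheableRequest): string | null {
  if (req.temperature !== 0) return null; // sampled output (or unset temperature)
  if (req.tools && req.tools.length > 0) return null; // tool-using conversation
  const lastUser = [...req.messages].reverse().find((m) => m.role === "user");
  if (!lastUser) return null;
  return [req.model, sha(req.system ?? ""), sha(lastUser.content)].join(":");
}

// LRU with TTL, relying on Map preserving insertion order.
class ResponseCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly ttlMs: number;

  constructor(maxEntries: number, ttlMs: number) {
    this.maxEntries = maxEntries;
    this.ttlMs = ttlMs;
  }

  get(key: string, now: number = Date.now()): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    this.entries.delete(key);
    if (hit.expiresAt <= now) return undefined; // expired entries are dropped
    this.entries.set(key, hit); // re-insert to mark as most recently used
    return hit.value;
  }

  set(key: string, value: V, now: number = Date.now()): void {
    this.entries.delete(key);
    this.entries.set(key, { value, expiresAt: now + this.ttlMs });
    if (this.entries.size > this.maxEntries) {
      const oldest = this.entries.keys().next().value; // least recently used
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }
}
```

One landmine this makes visible: the key ignores everything before the last user message, so two different conversations that end with the same user text (for example "continue") would share an entry. Hashing the full message history into the key is one way to avoid that, at the cost of fewer hits.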

`npm i -g @curatedmcp/tokenshield` to try it. MIT, self-hosted, your API key never leaves the machine, telemetry is opt-in and aggregate-only.

More at: https://curatedmcp.com/tokenshield

Depression linked to bacterium-chemical interaction in personal care products

https://tech-paper.com/new-research-found-that-depression-may-begin-in-your-gut-when-a-common-bac...
1•cachecrab•37s ago•0 comments

The Sunk Cost Fallacy and How It Influences Our Decisions

https://almossawi.substack.com/p/the-sunk-cost-fallacy
1•anarbadalov•1m ago•0 comments

Andrej Karpathy Joins Anthropic

https://www.thevccorner.com/p/breaking-andrej-karpathy-joins-anthropic
2•vinni2•2m ago•0 comments

Google Antigravity CLI

https://antigravity.google/blog/introducing-google-antigravity-cli
2•jbirnick•3m ago•0 comments

Google introduces Gemini Spark, a 24/7 agentic assistant with Gmail integration

https://techcrunch.com/2026/05/19/google-introduces-gemini-spark-a-24-7-agentic-assistant-with-gm...
1•gfortaine•4m ago•0 comments

Show HN: Logbox – let Claude monitor your dev logs

https://github.com/struct-dot-ai/logbox
2•nimeshmc•4m ago•0 comments

Likely AI-generated short story won a major prize

https://twitter.com/nabeelqu/status/2056397504824963296
2•thatoneengineer•4m ago•0 comments

Show HN: Melogen – Generate MIDI melodies for free

https://www.melogen.ai/
1•squirrelon•7m ago•0 comments

Show HN: FastBack end – schema-first back end runtime with OpenAPI output

https://github.com/darula-hpp/fastbackend
1•ombedzi•7m ago•0 comments

The Gemini app becomes more agentic, delivering proactive, 24/7 help

https://blog.google/innovation-and-ai/products/gemini-app/next-evolution-gemini-app/
2•gfortaine•10m ago•0 comments

Disney Erased FiveThirtyEight

https://www.natesilver.net/p/disney-erased-fivethirtyeight
4•7777777phil•11m ago•0 comments

Which campaigns actually drive your leads?

https://www.digitalpilot.app/
1•iamjeylabrecque•13m ago•0 comments

Show HN: Coding agent where a second agent QAs every PR in a real browser

https://www.notesasm.com/
1•kavin_key•13m ago•0 comments

The missing men of the American marriage market

https://www.npr.org/sections/planet-money/2026/05/19/g-s1-122695/the-missing-men-of-the-american-...
2•sizzle•14m ago•0 comments

Scientists worried about de-extinction ethics as biotech co. touts breakthrough

https://www.rnz.co.nz/news/science-and-technology/595719/scientists-concerned-about-de-extinction...
3•billybuckwheat•14m ago•0 comments

Automate your computer using real code – not drag-and-drop blocks

https://github.com/hassananayi/codeonix
1•hassananayi•14m ago•1 comments

The Trouble with Emotion AI

https://www.computerworld.com/article/4171382/the-trouble-with-emotion-reading-ai.html
2•mikelgan•14m ago•1 comments

Lapdog: Local Coding Agent Assistant

https://lapdog.datadoghq.com/
1•astuyvenberg•16m ago•1 comments

Ruby vs. Java vs. TypeScript: Building Claude Cowork Docx Plugin

https://tanin.nanakorn.com/ruby-java-typescrip-claude-docx-plugin/
2•tanin•16m ago•0 comments

Mistral AI Python package compromised on PyPI [2026-05-12]

https://github.com/mistralai/client-python/issues/523
2•r2vcap•16m ago•0 comments

Finding Unpinned and Unpinnable GitHub Actions Across Your Org

https://www.pavel.gr/blog/finding-unpinned-and-unpinnable-github-actions
1•howlett•18m ago•0 comments

From Compute Overhang to Compute Crunch

https://secondthoughts.ai/p/the-ai-race
1•speckx•18m ago•0 comments

Chrome Dev Blog: Declarative Partial Updates (Interleaved HTML Streaming)

https://bsky.app/profile/did:plc:ilj6i6evo5xxl5iixp2y76nt/post/3mm7rxrubqs2v
1•avarev•19m ago•0 comments

Show HN: Search 67K .AI domains by AI-extracted tags and descriptions

https://ratemyaisite.com/explore
1•prolly97•21m ago•0 comments

Gemini Omni Flash is coming soon

https://gemini-omni-flash.net/
1•Jenny249•22m ago•0 comments

A case against the case against full-body MRI screening

https://medium.com/the-tideline/why-the-smartest-people-i-know-are-ignoring-their-doctors-on-full...
1•biancaleeman•22m ago•1 comments

TinyFish Vault: Your Web Agent Can Now Log in Without Touching Your Passwords

https://www.tinyfish.ai/blog/tinyfish-vault-your-web-agent-can-now-log-in-without-touching-your-p...
1•gargigupta•22m ago•0 comments

AI slop is flooding maths YouTube [video]

https://www.youtube.com/watch?v=mRO_QonhC2c
5•Imustaskforhelp•25m ago•1 comments

Google pushes update to Antigravity instead it reinstalls and locks everyone out

https://twitter.com/antigravity/status/2056795168326754759
4•thekevan•26m ago•2 comments

The TTY Demystified (2008)

https://www.linusakesson.net/programming/tty/index.php
2•20after4•27m ago•0 comments