> Important note for Claude Code users: Claude Code already handles prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box.
Source: their GitHub
From the FAQ:
You're right, and it's a fair question. Claude Code does handle prompt caching automatically for its own API calls — system prompts, tool definitions, and conversation history are cached out of the box. You don't need this plugin for that.
This plugin is for a different layer: when you build your own apps or agents with the Anthropic SDK. Raw SDK calls don't get automatic caching unless you place cache_control breakpoints yourself. This plugin does that automatically, plus gives you visibility into what's being cached, hit rates, and real savings — which Claude Code doesn't expose.
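To make the "different layer" concrete, here is a minimal sketch of what placing a `cache_control` breakpoint in a raw SDK call looks like. The model id and prompt text are placeholders; the request shape follows Anthropic's Messages API, where system blocks accept a `"cache_control": {"type": "ephemeral"}` field.

```python
# Sketch: where a cache_control breakpoint goes in a raw Anthropic SDK call.
# Everything up to and including the marked block becomes the cacheable prefix.

LONG_SYSTEM_PROMPT = "You are a support agent. " * 200  # stand-in for a large, stable prefix

request = {
    "model": "claude-sonnet-4-20250514",  # placeholder model id
    "max_tokens": 1024,
    "system": [
        {
            "type": "text",
            "text": LONG_SYSTEM_PROMPT,
            # Breakpoint: the prefix ending here is eligible for caching.
            "cache_control": {"type": "ephemeral"},
        }
    ],
    "messages": [
        {"role": "user", "content": "Where is my order?"},
    ],
}

# With the real SDK you would pass these fields to
# anthropic.Anthropic().messages.create(**request).
```

Without that breakpoint, the same call re-processes the full system prompt on every request.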
> Claude Code already handles prompt caching automatically for its own API calls
Claude Code is an app. The API layer is different.
When did people start thinking that the Claude Code app and the API are the same thing?
Are these just all confused vibe coders?
Also, the Anthropic API already introduced prompt caching: https://platform.claude.com/docs/en/build-with-claude/prompt...

What is new here?
> "Hasn't Anthropic's new auto-caching feature solved this?"
> Largely, yes — Anthropic's automatic caching (passing "cache_control": {"type": "ephemeral"} at the top level) handles breakpoint placement automatically now. This plugin predates that feature and originally filled that gap.
Domain Name: prompt-caching.ai
Updated Date: 2026-03-12T20:31:44Z
Creation Date: 2026-03-12T20:27:35Z
Registry Expiry Date: 2028-03-12T20:27:35Z
spiderfarmer•1h ago
stingraycharles•1h ago
As a matter of fact, I think this is not a problem at all, as Anthropic makes it extremely easy to cache stuff: you just set your preferred cache level on the last message, and Anthropic will automatically cache it under the hood. Every distinct message is another "cache" point — e.g. they first compute the hash of all messages; if no cache entry is found, they compute the hash of all messages minus the last one, and so on.
It's really a non-problem.
ermis•1h ago
stingraycharles•15m ago
I am so confused why you chose an MCP server to solve this. Wouldn't a regular API at least have some merit in how it could be used, in that it doesn't require an LLM to invoke it?