We’re the team behind Fiddler, and we recently built an Agent Cache feature.
It works by capturing an LLM request/response pair and, once caching is enabled, serving subsequent matching calls locally so duplicate requests never hit the provider.
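To make the mechanism concrete, here is a minimal sketch of that record/replay idea: key the cache on the full request payload, serve repeats locally, and fall through to the provider only on a miss. This is just an illustration under our description above, not Fiddler's actual implementation; `AgentCache`, `cache_key`, and `complete` are hypothetical names.

```python
import hashlib
import json

def cache_key(model, messages, **params):
    # Deterministic key over the full request payload (hypothetical scheme).
    payload = json.dumps(
        {"model": model, "messages": messages, "params": params},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode()).hexdigest()

class AgentCache:
    """Record/replay cache: the first call hits the provider,
    matching repeats are served from the local store."""

    def __init__(self, provider):
        self.provider = provider  # callable(model, messages, **params) -> response
        self.store = {}
        self.hits = 0
        self.misses = 0

    def complete(self, model, messages, **params):
        key = cache_key(model, messages, **params)
        if key in self.store:
            self.hits += 1
            return self.store[key]  # served locally; no provider round-trip
        self.misses += 1
        response = self.provider(model, messages, **params)
        self.store[key] = response  # record for future matching calls
        return response
```

Because the key covers the whole request, any change to the model, messages, or parameters is a miss by design; identical calls replay the recorded response, which is also what makes cached runs deterministic.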
It’s meant to reduce three pains that show up together in agent development: token cost, feedback-loop latency, and output non-determinism.
We’d love to hear your thoughts.