My idea was adaptive caching: give LLMs the fastest, cleanest path possible from cache to code so they can continuously optimize the cache. I now have a caching subagent that opens 2-3 small cache-optimization PRs a day, and in both projects I see a hit rate of over 95%.
The name t87s is a numeronym for the oft-cited Phil Karlton quote, with a self-inflicted off-by-one error :)
The core of the idea is to decouple invalidations from cache keys using message passing: invalidating a tag is a single message rather than a scan of affected keys, so invalidation is O(1) regardless of how many entries are affected.
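To make that concrete, here's a minimal sketch of the pattern in TypeScript (illustrative names only, not the actual t87s API): entries remember which tags they were written under and when, invalidating a tag just records a timestamp, and a read only counts as a hit if the entry was written after every one of its tags was last invalidated.

```typescript
// Minimal sketch of tag-based invalidation via timestamps.
// Illustrative only -- not the actual t87s API.

interface Entry<T> {
  value: T;
  tags: string[];
  writtenAt: number; // ms since epoch
}

class TagCache<T> {
  private entries = new Map<string, Entry<T>>();
  private invalidatedAt = new Map<string, number>(); // tag -> last invalidation time

  set(key: string, value: T, tags: string[]): void {
    this.entries.set(key, { value, tags, writtenAt: Date.now() });
  }

  // O(1): touch one tag timestamp instead of walking every affected entry.
  invalidateTag(tag: string): void {
    this.invalidatedAt.set(tag, Date.now());
  }

  get(key: string): T | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    // Fresh only if the entry was written after every tag's last invalidation.
    const stale = entry.tags.some(
      (tag) => (this.invalidatedAt.get(tag) ?? 0) >= entry.writtenAt
    );
    return stale ? undefined : entry.value;
  }
}
```

The invalidation itself is the message; nothing ever has to enumerate the affected keys.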
This has a known race condition (an invalidation fires between the origin fetch and the cache write, so the stale entry looks fresh forever), which the cloud adapter solves with Cloudflare Durable Objects -- a single-threaded coordinator per tenant that serializes tag reads and writes.
Entries live in KV for fast global reads, and consistency is enforced through the DO. The downside is slightly slower reads, but you get atomic transactions -- usually prohibitively expensive in caching systems -- that prevent stale data in the cache plumbing itself.
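Here's roughly how the coordinator closes that race -- a simplified sketch rather than the real adapter code: the client notes when its origin fetch started, and the single-threaded coordinator refuses the cache write if any tag the entry depends on was invalidated after that point.

```typescript
// Sketch of the serialized write check (the real adapter runs this inside
// a per-tenant Durable Object; these shapes are hypothetical).

interface WriteRequest {
  key: string;
  tags: string[];
  fetchStartedAt: number; // when the client began the origin fetch
}

class TagCoordinator {
  private invalidatedAt = new Map<string, number>();

  invalidateTag(tag: string): void {
    this.invalidatedAt.set(tag, Date.now());
  }

  // Because the coordinator is single-threaded, this check and the decision
  // to write to KV cannot interleave with an invalidation.
  admitWrite(req: WriteRequest): boolean {
    const racedInvalidation = req.tags.some(
      (tag) => (this.invalidatedAt.get(tag) ?? 0) >= req.fetchStartedAt
    );
    // If a tag was invalidated after the fetch began, caching this value
    // would make stale data look fresh forever -- so reject the write.
    return !racedInvalidation;
  }
}
```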
10% of cache hits are silently re-fetched and compared (a verification canary). Every operation -- hits, misses, invalidations, verifications, "blast radius" as Opus likes to call it (I've started saying this IRL and talking like an LLM...) -- is logged to SQLite in the Analytics DO, queryable via raw SQL through an API endpoint. That's what my subagent uses. It queries miss rates, stale-verification frequency, and invalidation behavior, then traces problems back to the tag schema in the codebase and opens PRs.
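For a flavor of what the subagent actually asks, here's the kind of query it might send to the analytics endpoint (the endpoint path, table, and column names below are made up for illustration; the real schema differs):

```typescript
// Illustrative only: hypothetical endpoint, table, and columns.
// Assumes `ts` is unix seconds and `op` is one of hit/miss/invalidate/verify.
const sql = `
  SELECT tag,
         SUM(op = 'miss') * 1.0 / COUNT(*)  AS miss_rate,
         SUM(op = 'verify' AND stale = 1)   AS stale_verifications,
         SUM(op = 'invalidate')             AS invalidations
  FROM cache_ops
  WHERE ts > strftime('%s', 'now') - 86400
  GROUP BY tag
  ORDER BY miss_rate DESC
  LIMIT 20;
`;

const res = await fetch("https://example.com/api/analytics/query", {
  method: "POST",
  headers: { "content-type": "application/json" },
  body: JSON.stringify({ sql }),
});
const rows = await res.json(); // the subagent reasons over these rows and opens PRs
```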
There are TS and Python clients that are free to use with Redis and perform well. Adaptive caching is paid, with a free tier. Cache long and prosper :)