I’m launching zenllm.io, a read-only layer that correlates LLM requests → model → tokens → $ → service/team, so you can quickly answer “why did our bill spike?”. It detects patterns like prompts growing longer over time, retries, and agent/tool loops, then suggests cost/quality tradeoffs (e.g., model choice, context trimming).
No proxy/gateway required. Link: www.zenllm.io
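
To make the idea concrete, here is a minimal sketch (not zenllm.io’s actual implementation) of the kind of correlation described above: estimate each request’s cost from its token counts and a per-model price table, then group spend by team or model to see what drove a spike. The price figures, model names, and log fields below are illustrative placeholders, not real provider data.

```python
from collections import defaultdict

# Hypothetical per-1K-token prices (USD); real prices vary by provider and model.
PRICES = {
    "gpt-4o":      {"input": 0.005,   "output": 0.015},
    "gpt-4o-mini": {"input": 0.00015, "output": 0.0006},
}

def request_cost(rec: dict) -> float:
    """Estimate the $ cost of one logged request from its token counts."""
    p = PRICES[rec["model"]]
    return (rec["prompt_tokens"] / 1000) * p["input"] \
         + (rec["completion_tokens"] / 1000) * p["output"]

def spend_by(records: list[dict], key: str) -> dict[str, float]:
    """Sum estimated cost grouped by an attribute such as 'team' or 'model'."""
    totals: dict[str, float] = defaultdict(float)
    for rec in records:
        totals[rec[key]] += request_cost(rec)
    return dict(totals)

if __name__ == "__main__":
    # Toy request log; in practice this would come from provider usage exports.
    log = [
        {"team": "search",  "model": "gpt-4o",      "prompt_tokens": 12000, "completion_tokens": 800},
        {"team": "search",  "model": "gpt-4o",      "prompt_tokens": 18000, "completion_tokens": 900},
        {"team": "support", "model": "gpt-4o-mini", "prompt_tokens": 3000,  "completion_tokens": 500},
    ]
    print(spend_by(log, "team"))   # {'search': 0.1755, 'support': 0.00075}
    print(spend_by(log, "model"))
```

Comparing these grouped totals across two time windows is one simple way to surface which team or model accounts for a spike; detecting retries or agent loops would additionally need request metadata (e.g., trace or conversation IDs), which this sketch omits.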