We ran a trace on our own MCP server and found 9 distinct friction points in a single 10-minute session. Things like undocumented enum constraints and case-sensitivity issues that never showed up as "errors" in our standard logs.
AgentVoice closes this loop by triangulating three sources: it runs an LLM observer over session transcripts to diagnose intent, pulls production telemetry to prove the scale of the failure, and adds a submit_feedback tool so agents can narrate their own friction in real-time.
I’d love to get feedback on the approach, especially from anyone currently shipping MCP servers or agent-facing APIs.