I’m building Lenzy AI, possibly the first product analytics platform for AI agents.
From my research:
Companies building AI agents have thousands or even millions of conversations. In these, users express what they need, use, love, or hate, often long before they reach out to support (or churn).
Some teams try to read chats manually, some build in-house pipelines to analyze them, and others miss out on this data entirely.
The idea:
Lenzy continuously analyzes conversations users have with your AI agents to:
1. Discover missing features (e.g., “Fetch info from a URL” mentioned 42 times this week)
2. Spot churn signals (e.g., “Bob’s requests were fulfilled in 35% of chats. Frustration is high”)
3. Detect chats that require human review.
4. Track user satisfaction and task completion rates.
5. Surface any custom insight (e.g., “Top topics with support?”, “Most used features?”). A rough sketch of one such analysis pass follows this list.
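To make that concrete, here is a minimal sketch of what one per-conversation analysis pass could look like. Everything below is illustrative and assumes an OpenAI-compatible client; the prompt, JSON fields, and model name are my placeholders, not Lenzy’s actual implementation.

    # Illustrative sketch only: one analysis pass over a single conversation.
    # Assumes an OpenAI-compatible client; prompt, schema, and model are placeholders.
    import json
    from collections import Counter
    from openai import OpenAI

    client = OpenAI()

    ANALYSIS_PROMPT = (
        "You are a product analyst. Given a full user-agent conversation, "
        "return JSON with: feature_requests (list of short phrases), "
        "frustration (0-1), task_completed (bool), needs_human_review (bool)."
    )

    def analyze_conversation(messages: list[dict]) -> dict:
        # Flatten the multi-turn chat into one transcript and ask the model
        # for the structured signals described in the list above.
        transcript = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
        resp = client.chat.completions.create(
            model="gpt-4o-mini",
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": ANALYSIS_PROMPT},
                {"role": "user", "content": transcript},
            ],
        )
        return json.loads(resp.choices[0].message.content)

    def aggregate(results: list[dict]) -> dict:
        # Roll per-conversation results up into product-level metrics:
        # feature-request counts, task completion rate, review queue size.
        mentions = Counter(f for r in results for f in r["feature_requests"])
        completed = sum(r["task_completed"] for r in results) / max(len(results), 1)
        return {
            "top_feature_requests": mentions.most_common(10),
            "task_completion_rate": completed,
            "needs_review": sum(r["needs_human_review"] for r in results),
        }

Run analyze_conversation over every finished chat and feed the results into aggregate; the hard part of the product is presumably making those rollups cheap, continuous, and queryable, not any single LLM call.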
Existing solutions:
I reviewed about 10 analytics and observability tools, including Langfuse, Helicone, and Braintrust. They all seem to focus on evaluating individual LLM calls, not full conversations. Some are starting to move toward multi-turn evals, which suggests there is demand, but they look constrained by data models built around single-call evaluation. For example, in Langfuse you’d have to pack the entire conversation into a single extra LLM call to analyze it, turning N calls into N+1.
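To illustrate that constraint, here is a generic sketch with made-up trace and judge objects (not Langfuse’s actual API): if the data model only evaluates individual generations, judging the whole conversation means stitching the N logged turns back together and spending one extra LLM call on the stitched transcript.

    # Illustrative only: generic trace/judge objects, NOT a real Langfuse API.
    # A per-call evaluator scores one generation at a time; to judge the whole
    # conversation you rebuild the transcript from the N generations and make
    # one extra LLM call over it -- the call that turns N into N+1.
    def evaluate_full_conversation(trace, llm_judge) -> str:
        generations = [o for o in trace.observations if o.type == "GENERATION"]
        transcript = "\n".join(
            f"user: {g.input}\nassistant: {g.output}" for g in generations
        )
        return llm_judge(
            "Did the assistant fulfill the user's request across this whole "
            "conversation? Answer yes/no with a one-line reason.\n\n" + transcript
        )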
Where I’m at:
It’s been three weeks since the idea’s inception. I’ve interviewed 12 founders shipping AI agents and developed the product concept.
Now, I’m resisting the urge to build until I find the right design partners to build this with.
If you’ve shipped an AI agent with 100+ daily conversations and aren’t analyzing them yet, consider becoming a Lenzy design partner at https://lenzy.ai. You set the price. I build for your needs. Premium support forever.
Would love your feedback! What am I missing? Should I make it open source?
BohdanPetryshyn•2h ago
The fact that nothing like this seems to exist yet makes me wonder if there’s a pitfall I haven’t discovered.
Is there a fundamental reason this hasn’t been built before?