For an in-product AI assistant (with grounding, doc retrieval, and tool calling), I'm having a hard time wrapping my head around how to evaluate and monitor its success in customer interactions: prompt adherence, correctness, appropriateness, etc.
Any tips or resources that have been helpful to folks investigating this challenge? Would love to learn. What does your stack / process look like?