That's why we built Corepoints. You can create and send online assessments (OAs) where AI usage, via an AI chat, is a core metric.
You have full control of the testing environment: hallucinations, data leaks, and LLM behavior. You can also grade candidates on aspects such as answer accuracy (of course), prompting quality, reasoning quality, hallucination susceptibility, token usage, and more.
We're currently doing a demo/beta run for about the next month so that we can iterate on feedback. No payment or commitment is required.
You can request an account here: https://app.corepoints.ai/request-access
All comments, suggestions, and feedback are welcome and much appreciated. You can also reach us at hello@corepoints.ai.
Thanks everyone.