Hi HN, I’m Tina.
Over the past few months I’ve been frustrated with how often LLMs hallucinate: they generate answers with high confidence even when the underlying information simply doesn’t exist. For teams relying on AI (content, research, finance, even legal), this can be a serious problem.
So I’ve been working on CompareGPT, a tool designed to make LLM outputs more trustworthy. Key features:
Confidence scoring → surfaces how reliable an answer is.
Source validation → checks whether an answer can be backed by references.
Side-by-side comparison → run the same query across multiple models and see how consistent their answers are (rough sketch below).
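To give a flavor of what “consistency” means here, a toy Python sketch (hard-coded placeholder answers stand in for real model calls; this is an illustration of the idea, not our production code):

    # Toy illustration: ask several models the same question and score
    # how much their answers agree. Placeholder strings stand in for
    # real model API calls.
    from difflib import SequenceMatcher
    from itertools import combinations

    def agreement(answers):
        """Average pairwise text similarity, a crude proxy for consistency."""
        pairs = list(combinations(answers, 2))
        if not pairs:
            return 1.0
        return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

    answers = {
        "model_a": "The Basel III minimum CET1 ratio is 4.5%.",
        "model_b": "Basel III requires a minimum CET1 ratio of 4.5%.",
        "model_c": "The minimum is 8%.",  # outlier worth flagging
    }

    score = agreement(list(answers.values()))
    print(f"consistency score: {score:.2f}")  # low score -> answers disagree, treat with caution

A low score is a signal to double-check the answer rather than trust any single model.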
You can try it here:
https://test.comparegpt.io/home
Right now it works best on knowledge-based queries (finance, law, science), and we’re still improving edge cases. I’d love to hear your thoughts, feedback, or even brutal criticism from this community.
Thanks for reading!