Hi HN, I’m Tina.
Over the past few months I’ve been frustrated with how often LLMs hallucinate: they generate answers with high confidence even when the underlying information simply doesn’t exist. For teams relying on AI (content, research, finance, even legal), this can be a serious problem.
So I’ve been working on CompareGPT, a tool designed to make LLM outputs more trustworthy. Key features:
Confidence scoring → surfaces how reliable an answer is.
Source validation → checks whether an answer can be backed by references.
Side-by-side comparison → run the same query across multiple models and see how consistent their answers are (rough sketch below).
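To give a flavor of what “consistency” means here, a toy Python sketch (hard-coded placeholder answers stand in for real model calls; this is an illustration of the idea, not our production code):

    # Toy illustration: ask several models the same question and score
    # how much their answers agree. Placeholder strings stand in for
    # real model API calls.
    from difflib import SequenceMatcher
    from itertools import combinations

    def agreement(answers):
        """Average pairwise text similarity, a crude proxy for consistency."""
        pairs = list(combinations(answers, 2))
        if not pairs:
            return 1.0
        return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)

    answers = {
        "model_a": "The Basel III minimum CET1 ratio is 4.5%.",
        "model_b": "Basel III requires a minimum CET1 ratio of 4.5%.",
        "model_c": "The minimum is 8%.",  # outlier worth flagging
    }

    score = agreement(list(answers.values()))
    print(f"consistency score: {score:.2f}")  # low score -> answers disagree, treat with caution

A low score is a signal to double-check the answer rather than trust any single model.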
You can try it here:
https://test.comparegpt.io/home
Right now it works best on knowledge-based queries (finance, law, science), and we’re still improving edge cases. I’d love to hear your thoughts, feedback, or even brutal criticism from this community.
Thanks for reading!