frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

An AI model that can read and diagnose a brain MRI in seconds

https://www.michiganmedicine.org/health-lab/ai-model-can-read-and-diagnose-brain-mri-seconds
1•hhs•1m ago•0 comments

Dev with 5 of experience switched to Rails, what should I be careful about?

1•vampiregrey•3m ago•0 comments

AlphaFace: High Fidelity and Real-Time Face Swapper Robust to Facial Pose

https://arxiv.org/abs/2601.16429
1•PaulHoule•4m ago•0 comments

Scientists discover “levitating” time crystals that you can hold in your hand

https://www.nyu.edu/about/news-publications/news/2026/february/scientists-discover--levitating--t...
1•hhs•6m ago•0 comments

Rammstein – Deutschland (C64 Cover, Real SID, 8-bit – 2019) [video]

https://www.youtube.com/watch?v=3VReIuv1GFo
1•erickhill•7m ago•0 comments

Tell HN: Yet Another Round of Zendesk Spam

1•Philpax•7m ago•0 comments

Postgres Message Queue (PGMQ)

https://github.com/pgmq/pgmq
1•Lwrless•11m ago•0 comments

Show HN: Django-rclone: Database and media backups for Django, powered by rclone

https://github.com/kjnez/django-rclone
1•cui•13m ago•1 comments

NY lawmakers proposed statewide data center moratorium

https://www.niagara-gazette.com/news/local_news/ny-lawmakers-proposed-statewide-data-center-morat...
1•geox•15m ago•0 comments

OpenClaw AI chatbots are running amok – these scientists are listening in

https://www.nature.com/articles/d41586-026-00370-w
2•EA-3167•15m ago•0 comments

Show HN: AI agent forgets user preferences every session. This fixes it

https://www.pref0.com/
5•fliellerjulian•17m ago•0 comments

Introduce the Vouch/Denouncement Contribution Model

https://github.com/ghostty-org/ghostty/pull/10559
2•DustinEchoes•19m ago•0 comments

Show HN: SSHcode – Always-On Claude Code/OpenCode over Tailscale and Hetzner

https://github.com/sultanvaliyev/sshcode
1•sultanvaliyev•20m ago•0 comments

Microsoft appointed a quality czar. He has no direct reports and no budget

https://jpcaparas.medium.com/microsoft-appointed-a-quality-czar-he-has-no-direct-reports-and-no-b...
2•RickJWagner•21m ago•0 comments

Multi-agent coordination on Claude Code: 8 production pain points and patterns

https://gist.github.com/sigalovskinick/6cc1cef061f76b7edd198e0ebc863397
1•nikolasi•22m ago•0 comments

Washington Post CEO Will Lewis Steps Down After Stormy Tenure

https://www.nytimes.com/2026/02/07/technology/washington-post-will-lewis.html
8•jbegley•22m ago•1 comments

DevXT – Building the Future with AI That Acts

https://devxt.com
2•superpecmuscles•23m ago•4 comments

A Minimal OpenClaw Built with the OpenCode SDK

https://github.com/CefBoud/MonClaw
1•cefboud•23m ago•0 comments

The silent death of Good Code

https://amit.prasad.me/blog/rip-good-code
3•amitprasad•24m ago•0 comments

The Internal Negotiation You Have When Your Heart Rate Gets Uncomfortable

https://www.vo2maxpro.com/blog/internal-negotiation-heart-rate
1•GoodluckH•25m ago•0 comments

Show HN: Glance – Fast CSV inspection for the terminal (SIMD-accelerated)

https://github.com/AveryClapp/glance
2•AveryClapp•26m ago•0 comments

Busy for the Next Fifty to Sixty Bud

https://pestlemortar.substack.com/p/busy-for-the-next-fifty-to-sixty-had-all-my-money-in-bitcoin-...
1•mithradiumn•27m ago•0 comments

Imperative

https://pestlemortar.substack.com/p/imperative
1•mithradiumn•28m ago•0 comments

Show HN: I decomposed 87 tasks to find where AI agents structurally collapse

https://github.com/XxCotHGxX/Instruction_Entropy
2•XxCotHGxX•32m ago•1 comments

I went back to Linux and it was a mistake

https://www.theverge.com/report/875077/linux-was-a-mistake
3•timpera•33m ago•1 comments

Octrafic – open-source AI-assisted API testing from the CLI

https://github.com/Octrafic/octrafic-cli
1•mbadyl•34m ago•1 comments

US Accuses China of Secret Nuclear Testing

https://www.reuters.com/world/china/trump-has-been-clear-wanting-new-nuclear-arms-control-treaty-...
3•jandrewrogers•35m ago•2 comments

Peacock. A New Programming Language

2•hashhooshy•40m ago•1 comments

A postcard arrived: 'If you're reading this I'm dead, and I really liked you'

https://www.washingtonpost.com/lifestyle/2026/02/07/postcard-death-teacher-glickman/
4•bookofjoe•41m ago•1 comments

What to know about the software selloff

https://www.morningstar.com/markets/what-know-about-software-stock-selloff
2•RickJWagner•45m ago•0 comments
Open in hackernews

Show HN: HalluMix – A Benchmark for Real-World LLM Hallucination Detection

https://huggingface.co/blog/quotientai/hallumix
4•jneagu•9mo ago
We built HalluMix to evaluate how well hallucination detectors perform in the kinds of messy, high-stakes environments where LLMs are actually deployed: long-form outputs, multi-document contexts, and domain-specific tasks like law, medicine, science, and news.

Most existing benchmarks focus on synthetic or short-form QA data. That didn’t reflect what we were seeing in production, so we built our own to test our hallucination detectors, and decided to open source it.

The dataset includes 6,500 examples across QA, summarization, and NLI tasks. We added distractor documents, shuffled the context, and removed assumptions about format (like requiring a question) to better reflect real-world conditions.

We ran 7 detection systems on it, both open-source models and commercial APIs. While some performed well on shorter examples, even the best struggled with long-form content and multi-document grounding -- precisely where hallucinations tend to be most harmful.

Would love feedback, especially from anyone working on evals, hallucination detection, or RAG.

Links: – HF Dataset: https://huggingface.co/datasets/quotient-ai/hallumix - HF Blog: https://huggingface.co/blog/quotientai/hallumix – Internal Blog: https://blog.quotientai.co/introducing-hallumix-a-task-agnos... – Paper: https://arxiv.org/abs/2505.00506