An open source common knowledge and context based Hallucination Detection Model

https://huggingface.co/AimonLabs/hallucination-detection-model
5•pjoshi30•7mo ago

Comments

pjoshi30•7mo ago
In a typical LLM application, the model's output can draw on the context provided in the prompt, on the LLM's pre-training data, or on the data the LLM was fine-tuned with. LLMs can produce incorrect results from any of these sources. This is popularly known as "LLM hallucination".

We built a model that can identify hallucinations in both a closed-book setting (internal knowledge that was "learnt" by the LLM during pre-training or fine-tuning) and an open-book setting (data explicitly provided to the LLM as part of its context). The model provides phrase-level attribution. We open sourced the model (available on HuggingFace) along with a new benchmark dataset that can be used to check how well any judge performs on the hallucination detection task.

Try the Google Colab notebook linked in the model card.

devvratbhardwaj•7mo ago
This is a much-needed direction, especially as LLMs are increasingly used in high-stakes settings. Phrase-level attribution for hallucination detection has real potential to improve transparency and trust in model outputs. The fact that both the model and benchmark are open source makes it even more valuable—enabling reproducibility and inviting broader collaboration.
pjoshi30•7mo ago
Absolutely! The benchmark is here: https://huggingface.co/datasets/AimonLabs/HDM-Bench

You can use it to check how well your LLM judges perform on the hallucination detection task.
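
For anyone wondering what "checking a judge" could look like in practice, here is a minimal sketch. It assumes each benchmark item boils down to a context, a response, and a gold yes/no hallucination label. The `Example` fields, `toy_judge`, and the four toy rows are all hypothetical stand-ins for illustration; they are not the actual schema of HDM-Bench or the API of the model, so map your own loader and judge onto them.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Example:
    # Hypothetical field names; not the real benchmark schema.
    context: str
    response: str
    is_hallucination: bool  # gold label


def score_judge(judge: Callable[[str, str], bool],
                examples: List[Example]) -> Dict[str, float]:
    """Compare a judge's yes/no verdicts against gold labels."""
    tp = fp = fn = tn = 0
    for ex in examples:
        predicted = judge(ex.context, ex.response)
        if predicted and ex.is_hallucination:
            tp += 1
        elif predicted and not ex.is_hallucination:
            fp += 1
        elif not predicted and ex.is_hallucination:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(examples) if examples else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def toy_judge(context: str, response: str) -> bool:
    # Deliberately naive placeholder: flag the response if it uses any
    # word that does not appear in the context. A real judge would be
    # an LLM or a trained detector.
    context_words = set(context.lower().split())
    return any(word not in context_words
               for word in response.lower().split())


# Toy rows invented for this sketch, not taken from any dataset.
EXAMPLES = [
    Example("paris is the capital of france",
            "paris is the capital of france", False),
    Example("the meeting is on monday",
            "the meeting is on friday", True),
    Example("the sky is blue",
            "the sky is blue today", False),
    Example("alice paid bob",
            "bob paid alice", True),
]

if __name__ == "__main__":
    print(score_judge(toy_judge, EXAMPLES))
```

The word-overlap judge misses the "bob paid alice" case (same words, different meaning) and wrongly flags the harmless "today", which is exactly the kind of failure a labeled benchmark makes visible. Swapping in a different judge only requires a function with the same `(context, response) -> bool` shape.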