Binary Retrieval-Augmented Reward Mitigates Hallucinations

https://arxiv.org/abs/2510.17733
22•MarlonPro•4h ago

Comments

amflare•1h ago
> Existing mitigation approaches often degrade performance on open-ended generation and downstream tasks, limiting their practical utility. [...] Unlike continuous reward schemes, our approach assigns a reward of one only when the model's output is entirely factually correct, and zero otherwise.

Someone correct me if I am wrong, as I am on the very edge of this space looking in, but does this mean that they are using a "degraded performance with fewer hallucinations" model to fact-check the "more powerful yet prone to hallucinations" model?

svnt•1h ago
Also on the edge, but it appears they are relying on search-augmented identification of conflicts in the generated statement, which is an easier task than constructing an answer to the question. It also encourages abstention because there are no conflicts in “I don’t know” (so “mitigating hallucinations” and “answering more questions correctly” are not necessarily the same thing).

mNovak•1h ago
My understanding is no: they collect a cache of documents from the training set, then, after pre-training, prompt the model about those topics. A separate verifier is given both the relevant source documents and the generated response, and is tasked with checking for factual conflicts between them.

They describe using Qwen 32B as the verifier, and the model under training is Qwen 8B. So in fact the verifier is beefier than the trainee model, though it's unclear if that has to be the case as you scale up.
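
To make the all-or-nothing scoring discussed above concrete, here is a minimal sketch, not the paper's implementation. It assumes the verifier can be treated as a function that returns the claims in a response that conflict with the source documents. The names `binary_reward`, `continuous_reward`, `split_claims`, and `toy_find_conflicts` are hypothetical, and the toy verifier (flagging any four-digit year that is absent from the documents) is only a stand-in for the LLM verifier mentioned in the thread.

```python
import re
from typing import Callable, List

# Hypothetical verifier type: (source documents, response) -> conflicting claims.
Verifier = Callable[[List[str], str], List[str]]


def split_claims(response: str) -> List[str]:
    """Naively treat each sentence as one claim (an assumption for this sketch)."""
    return [s.strip() for s in response.split(".") if s.strip()]


def toy_find_conflicts(docs: List[str], response: str) -> List[str]:
    """Toy stand-in for an LLM verifier: a claim conflicts if it mentions
    a four-digit year that appears in none of the source documents."""
    known_years = set(re.findall(r"\b\d{4}\b", " ".join(docs)))
    return [
        claim
        for claim in split_claims(response)
        if any(y not in known_years for y in re.findall(r"\b\d{4}\b", claim))
    ]


def binary_reward(docs: List[str], response: str, find_conflicts: Verifier) -> float:
    """1.0 only when the verifier finds no conflicts at all, else 0.0."""
    return 1.0 if not find_conflicts(docs, response) else 0.0


def continuous_reward(docs: List[str], response: str, find_conflicts: Verifier) -> float:
    """For contrast: the fraction of claims that do not conflict."""
    claims = split_claims(response)
    if not claims:
        return 1.0
    return 1.0 - len(find_conflicts(docs, response)) / len(claims)


docs = ["The paper was published in 1980."]
partly_wrong = "It was published in 1980. It was revised in 1999."

print(binary_reward(docs, "It was published in 1980.", toy_find_conflicts))  # 1.0
print(binary_reward(docs, partly_wrong, toy_find_conflicts))                 # 0.0
print(continuous_reward(docs, partly_wrong, toy_find_conflicts))             # 0.5
print(binary_reward(docs, "I don't know.", toy_find_conflicts))              # 1.0
```

The last line illustrates svnt's point: an abstention contains nothing that can conflict with the documents, so under this kind of scoring it earns the full reward even though it answers nothing.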
