The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•12mo ago

Comments

tocs3•12mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
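For anyone who wants to see what "check both the final answers and the reasoning steps" could look like in practice, here is a minimal Python sketch. It is not taken from the paper: the grid maze, the `create`/`close` trace format, and the function names `astar_with_trace`, `plan_is_valid`, and `trace_is_consistent` are all assumptions made for illustration. The idea is only that a plan and a trace can be validated separately, so one can be correct while the other is not.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (plan, trace).

    grid is a list of strings where '#' marks a wall. The trace format
    (create/close events) is a hypothetical one chosen for this sketch.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_cost = {start: 0}
    closed = set()
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get((nr, nc), float("inf")):
                best_cost[(nr, nc)] = new_cost
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_cost + h((nr, nc)), new_cost, (nr, nc)))
                trace.append(("create", (nr, nc), new_cost, h((nr, nc))))
    return None, trace  # goal unreachable


def plan_is_valid(grid, plan, start, goal):
    """Check the final answer only: unit steps, in bounds, no walls."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for r, c in plan:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])) or grid[r][c] == "#":
            return False
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(plan, plan[1:]))


def trace_is_consistent(trace):
    """Check the trace only: every close must follow a matching create."""
    created = set()
    for op, cell, cost, _ in trace:
        if op == "create":
            created.add((cell, cost))
        elif (cell, cost) not in created:
            return False
    return True


grid = ["..#",
        "..#",
        "..."]
plan, trace = astar_with_trace(grid, (0, 0), (2, 2))
scrambled = trace[::-1]  # same tokens, order destroyed

print(plan_is_valid(grid, plan, (0, 0), (2, 2)))  # plan check
print(trace_is_consistent(trace))                 # original trace check
print(trace_is_consistent(scrambled))             # scrambled trace check
```

The reversed trace fails the consistency check while the plan next to it still passes, which is the kind of mismatch the summary describes: a valid answer sitting beside steps that do not hold up as a search trace.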
