The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•7mo ago

Comments

tocs3•7mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
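
For a concrete picture of what a "step-by-step reasoning path" derived from A* search could look like, here is a minimal Python sketch. The grid encoding, the function name `astar_with_trace`, and the `create`/`close` trace tuples are illustrative assumptions, not the paper's actual trace format or training pipeline. It only shows that a search run yields both a final answer (the path) and a log of intermediate steps that can be checked separately.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid  : list of equal-length strings; '#' marks a wall (assumed encoding).
    trace : list of (op, cell, g, h) tuples, op is "create" or "close"
            (a hypothetical format chosen for illustration).
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            # Walk parent links back to the start to recover the answer.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == "#":
                continue
            new_g = g + 1
            if new_g < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = new_g
                parent[(nr, nc)] = cell
                trace.append(("create", (nr, nc), new_g, h((nr, nc))))
                heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))

    return None, trace  # goal unreachable


# The final answer and the intermediate steps are separate outputs.
grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path)
print(len(trace), "trace steps")
```

Because the path and the trace come back separately, each can be validated on its own, which is the kind of comparison the summary describes: a correct final path says nothing by itself about whether the accompanying steps are a valid search log.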
