The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
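
For anyone wondering what a "step-by-step reasoning path based on A* search" might look like as training data, here is a minimal sketch. It runs A* on a tiny grid maze and logs every node it creates or closes, so the log can be checked against the final path. The grid, the `astar_with_trace` name, and the `("create" | "close", cell, g, h)` trace format are all my own assumptions for illustration, not the paper's actual setup or tokenization.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid with unit step cost.

    grid: list of equal-length strings, '#' marks a wall.
    Returns (path, trace). The trace format is hypothetical:
    a list of (op, cell, g, h) tuples with op in {"create", "close"}.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already-expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            # Walk parent pointers back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))

    return None, trace  # goal unreachable


grid = [
    "..#.",
    "..#.",
    "....",
]
path, trace = astar_with_trace(grid, (0, 0), (0, 3))
print("path:", path)
print("trace steps:", len(trace))
```

In this framing, a training example would be the maze plus the trace plus the path. The "random and incorrect reasoning steps" condition described above would correspond to keeping the maze and the path but replacing the trace with one that does not match this problem.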
