The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
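To make the setup in the summary concrete, here is a minimal sketch of how a solver like A* yields both a final answer (a path) and an intermediate trace (the order in which nodes were expanded), so that the two can be checked separately. This is not the paper's code: the grid, the function names (`astar_with_trace`, `answer_is_valid`, `trace_is_valid`) and the choice to corrupt the trace by reversing it are all assumptions made up for illustration.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace): path is the answer, trace is the list of
    nodes in the order they were expanded (a stand-in for the
    "intermediate tokens"; hypothetical format, not the paper's).
    """
    def h(p):  # Manhattan distance, admissible on this grid
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    best_cost = {start: 0}
    parent = {start: None}
    closed, trace = set(), []

    while open_heap:
        _, cost, node = heapq.heappop(open_heap)
        if node in closed:
            continue
        closed.add(node)
        trace.append(node)
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            in_bounds = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if in_bounds and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    parent[nxt] = node
                    heapq.heappush(open_heap, (new_cost + h(nxt), new_cost, nxt))
    return None, trace

def answer_is_valid(grid, path, start, goal):
    """Check the final answer only: a connected, wall-free path."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if any(grid[r][c] != 0 for r, c in path):
        return False
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))

def trace_is_valid(grid, start, goal, trace):
    """Check the trace only: does it match what A* actually does?"""
    return trace == astar_with_trace(grid, start, goal)[1]

GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
START, GOAL = (0, 0), (2, 0)

path, trace = astar_with_trace(GRID, START, GOAL)
corrupted_trace = trace[::-1]  # assumed stand-in for an incorrect trace

print(answer_is_valid(GRID, path, START, GOAL))           # answer check
print(trace_is_valid(GRID, START, GOAL, trace))           # real trace
print(trace_is_valid(GRID, START, GOAL, corrupted_trace)) # corrupted trace
```

The point of keeping the two checks apart is that the same correct path can be paired with either a faithful or a corrupted trace, which is the kind of comparison the summary describes.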
