The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
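To make the setup in the summary concrete, here is a minimal sketch of what "reasoning steps based on A* search" could look like, and of how a trace can be checked independently of the final answer. It assumes a small grid maze and a trace made of `create`/`close` records; the function names `astar_with_trace` and `trace_is_valid` and the trace format are illustrative assumptions, not something stated in the summary above.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (1 = wall). Returns (path, trace).

    The trace is a list of (op, cell, g, h) tuples, where op is "create"
    when a cell is pushed onto the frontier and "close" when it is expanded.
    This record format is an assumption made for illustration.
    """
    def h(cell):
        # Manhattan distance, admissible and consistent with unit step costs.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    g_cost = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]
    trace = [("create", start, 0, h(start))]
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale frontier entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            new_g = g + 1
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
                trace.append(("create", nxt, new_g, h(nxt)))
    return None, trace


def trace_is_valid(trace, start, goal):
    """Structural check: the trace starts by creating `start`, only closes
    cells that were created earlier, and ends by closing `goal`."""
    if not trace or trace[0][:2] != ("create", start):
        return False
    created = set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return trace[-1][:2] == ("close", goal)
```

With a checker like this, a trace can be scored separately from the answer it precedes. A trace generated for a different maze or goal would typically fail `trace_is_valid` for the problem at hand, which is one way to build the "incorrect reasoning steps" condition the summary describes.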
