The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
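
For anyone who wants to see what "checking both the final answers and the reasoning steps" could look like in practice, here is a minimal Python sketch. It is not the paper's code. The grid encoding, the `create`/`close` trace format, and the function names `astar_with_trace`, `path_is_valid`, and `trace_is_valid` are all assumptions made up for illustration. The point is only that an A* run yields two separate artifacts, a search trace and a final path, and each can be validated independently of the other.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid ('#' marks a wall), unit step cost.

    Returns (path, trace). The trace is a hypothetical log of
    ("create", cell, g) and ("close", cell, g) events.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]
    came_from = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = []

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                    trace.append(("create", nxt, ng))
    return None, trace


def path_is_valid(grid, path, start, goal):
    """Check the final answer only: adjacent steps, no walls, right endpoints."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True


def trace_is_valid(trace, start):
    """Check the intermediate steps only: a cell must be created before it is closed."""
    created = {start}
    for op, cell, _ in trace:
        if op == "create":
            created.add(cell)
        elif cell not in created:
            return False
    return True


GRID = ["..#.",
        "..#.",
        "...."]
path, trace = astar_with_trace(GRID, (0, 0), (0, 3))
print(path_is_valid(GRID, path, (0, 0), (0, 3)), trace_is_valid(trace, (0, 0)))
```

Because the two checks are independent, an output can pass `path_is_valid` while failing `trace_is_valid`, which is the kind of mismatch between a correct answer and faulty steps that the summary describes.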
