The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
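For anyone who wants to see what "check both the final answers and the reasoning steps" could look like in practice, here is a minimal sketch, not the paper's code. It runs A* on a tiny grid and returns two separate things: the plan (final answer) and a trace of search steps. The grid, the `create`/`close` trace labels, and the function names `astar` and `plan_is_valid` are all assumptions made up for illustration.

```python
import heapq

def astar(grid, start, goal):
    """Return (plan, trace) on a 4-connected grid; '#' cells are walls."""
    rows, cols = len(grid), len(grid[0])

    def h(p):
        # Manhattan distance to the goal (admissible on a unit-cost grid)
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]   # entries are (f, g, node)
    parent = {start: None}
    cost = {start: 0}
    trace = []                          # hypothetical trace format
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > cost[node]:
            continue                    # stale queue entry, skip it
        trace.append(("close", node, g))
        if node == goal:
            plan = []
            while node is not None:     # walk parents back to the start
                plan.append(node)
                node = parent[node]
            return plan[::-1], trace
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                if g + 1 < cost.get((nr, nc), float("inf")):
                    cost[(nr, nc)] = g + 1
                    parent[(nr, nc)] = node
                    heapq.heappush(frontier, (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
                    trace.append(("create", (nr, nc), g + 1))
    return None, trace

def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only; the trace is never consulted."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    rows, cols = len(grid), len(grid[0])
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False                # not a single orthogonal step
        if not (0 <= r2 < rows and 0 <= c2 < cols) or grid[r2][c2] == "#":
            return False                # off the grid or into a wall
    return True

GRID = ["..#.",
        "..#.",
        "...."]
START, GOAL = (0, 0), (0, 3)

plan, trace = astar(GRID, START, GOAL)
print(plan_is_valid(GRID, START, GOAL, plan), len(plan), len(trace))
```

Because `plan_is_valid` only looks at the plan, a correct plan passes no matter what trace accompanies it. That is the separation the summary relies on: answer correctness and trace correctness can be scored independently.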
