The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
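For readers who want something concrete: the summary above mentions training on step-by-step traces derived from A* search, which lets both the final answer and the intermediate steps be checked. Below is a minimal sketch of what that could look like on a toy grid maze. The trace format (`"create"` / `"close"` tuples), the grid encoding, and the helper names `astar_with_trace`, `path_is_valid`, and `trace_is_valid` are all assumptions made for illustration, not the paper's actual setup.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list of
    (op, cell, g, h) tuples with op in {"create", "close"}; this format is a
    hypothetical stand-in for the "reasoning steps" discussed above.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def path_is_valid(grid, path, start, goal):
    """Check the final answer: a wall-free chain of adjacent cells."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
    return all(grid[r][c] != "#" for r, c in path)


def trace_is_valid(trace):
    """Check the steps: every closed cell must have been created earlier."""
    created = set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True


GRID = ["..#.",
        "..#.",
        "...."]
path, trace = astar_with_trace(GRID, (0, 0), (0, 3))
print(path_is_valid(GRID, path, (0, 0), (0, 3)), trace_is_valid(trace))
```

The point of having two separate checkers is that they can disagree: a shuffled or corrupted trace would fail `trace_is_valid` while the path it is paired with still passes `path_is_valid`. That is the kind of mismatch between steps and answer the summary describes.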
