
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
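
For anyone who wants to picture the setup described in that summary, here is a rough sketch of what "reasoning steps from A* search" and "incorrect reasoning steps" could look like as training data. Everything in it is an assumption made for illustration: the grid problem, the `create`/`close` trace wording, and the `swap_traces` helper are hypothetical and not taken from the paper, and rotating traces between problems is just one simple way to get steps that no longer match their problem.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace is a list of strings logging node
    creation and expansion: a toy stand-in for intermediate tokens.
    """
    def h(p):
        # Manhattan distance, admissible for unit-cost grid moves
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]
    came_from = {start: None}
    best_cost = {start: 0}
    trace = [f"create {start} g=0 h={h(start)}"]

    while frontier:
        _, cost, node = heapq.heappop(frontier)
        if cost > best_cost[node]:
            continue  # stale queue entry, a cheaper route was found later
        trace.append(f"close {node} g={cost} h={h(node)}")
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = came_from[node]
            return path[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    came_from[nxt] = node
                    heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
                    trace.append(f"create {nxt} g={new_cost} h={h(nxt)}")
    return None, trace


def swap_traces(examples):
    """Give each problem the trace of the next problem, keeping its own answer.

    Hypothetical helper: the answers stay correct, but the intermediate
    steps no longer describe the problem they are attached to.
    """
    traces = [e["trace"] for e in examples]
    rotated = traces[1:] + traces[:1]
    return [{**e, "trace": t} for e, t in zip(examples, rotated)]


# Two small problems on the same grid, differing only in the goal cell.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
examples = []
for goal in [(2, 0), (0, 2)]:
    path, trace = astar_with_trace(grid, (0, 0), goal)
    examples.append({"problem": ((0, 0), goal), "trace": trace, "answer": path})

swapped = swap_traces(examples)
```

Here `examples` plays the role of the "clean, correct" training set and `swapped` the mismatched one. The comparison the summary describes would then be between a model trained on each, which this sketch does not attempt.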
