The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
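For anyone who wants to see why "correct answer" and "correct reasoning steps" can be checked separately, here is a minimal sketch. It is not the paper's code: the grid maze, the `("create" | "close", cell, g, h)` trace format, and the function names `astar_with_trace`, `plan_is_valid`, and `trace_is_valid` are all assumptions made up for illustration. The only thing taken from the summary above is that the traces come from A* search.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid ('#' is a wall). Returns (path, trace).

    The trace is a list of (op, cell, g, h) tuples with op in
    {"create", "close"}; this format is hypothetical.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = [("create", start, 0, h(start))]
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng, h((nr, nc))))
    return None, trace


def plan_is_valid(grid, start, goal, path):
    """Check the final answer only: start to goal, no walls, unit steps."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


def trace_is_valid(trace):
    """Check the steps only: a cell must be created before it is closed,
    and no cell may be closed twice."""
    created, closed = set(), set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close":
            if cell not in created or cell in closed:
                return False
            closed.add(cell)
        else:
            return False
    return True


if __name__ == "__main__":
    maze = ["..#", "...", "#.."]
    path, trace = astar_with_trace(maze, (0, 0), (2, 2))
    scrambled = list(reversed(trace))  # same tokens, nonsensical order
    print(plan_is_valid(maze, (0, 0), (2, 2), path), trace_is_valid(trace))
    print(plan_is_valid(maze, (0, 0), (2, 2), path), trace_is_valid(scrambled))
```

The two validators never look at each other's input, so a plan can pass while the trace attached to it fails. That is the kind of mismatch the summary describes, though the paper's actual validators and trace format may differ from this sketch.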
