The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
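
To make the setup in the summary a bit more concrete, here is a minimal sketch of how a solver-derived "reasoning trace" could be generated alongside a final answer. Everything specific in it is an assumption for illustration and is not taken from the paper: the grid-maze problem, the `create`/`close` event names, the function names `astar_with_trace` and `corrupt_trace`, and the use of shuffling as one hypothetical way to produce an incorrect trace.

```python
import heapq
import random


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. trace is a list of
    ("create" | "close", (row, col)) events in the order A* performs them.
    The event format is an illustrative assumption, not the paper's.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start)]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    best_g = {start: 0}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell))
        if cell == goal:
            # Walk parent pointers back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc)))
    return None, trace  # goal unreachable


def corrupt_trace(trace, seed=0):
    """Hypothetical 'incorrect trace': same events, shuffled order."""
    rng = random.Random(seed)
    noisy = list(trace)
    rng.shuffle(noisy)
    return noisy


grid = ["..#",
        "...",
        "#.."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
noisy_trace = corrupt_trace(trace)
```

A training example in the "correct" condition would pair the problem with `trace` and `path`; in the "incorrect" condition it would pair the same problem and `path` with something like `noisy_trace`. Because the trace comes from a known algorithm, it can be checked step by step, independently of whether the final path is right.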
