The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
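
For anyone wondering what a "reasoning path based on A* search" could look like in practice, here is a minimal sketch. It is not the paper's code or trace format: the grid, the `astar_with_trace` name, and the `("create" | "close", cell, cost)` trace entries are all assumptions made up for illustration. It only shows that a solver like A* naturally emits a step-by-step log alongside its final answer, which is the kind of intermediate output the summary says can be checked against the solution.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace is a hypothetical log of search
    steps, not the format used in the paper.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f, g, cell)
    came_from = {start: None}
    best_g = {start: 0}
    trace = []

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route was already found
        trace.append(("close", cell, g))
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] != 0:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                came_from[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng))
    return None, trace  # goal unreachable


# Illustrative maze: a wall forces a detour around the right side.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(grid, (0, 0), (2, 0))
print("answer:", path)
print("trace steps:", len(trace))
```

In a setup like the one the summary describes, the trace would play the role of the "thoughts" and the path the role of the final answer, so the two can be validated separately.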
