The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
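
For anyone who wants something concrete, here is a rough sketch of what "checking both the final answers and the reasoning steps" could look like when the traces come from A* search. Everything specific here is an assumption for illustration and is not taken from the paper: the small grid maze, the `create`/`close` trace format, and the helper names `astar_with_trace`, `path_is_valid`, and `trace_is_valid` are all hypothetical.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list of
    (op, cell, g, h) tuples; this format is a made-up illustration.
    """
    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # heap entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry, already expanded at lower cost
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def path_is_valid(grid, start, goal, path):
    """Check the final answer: endpoints match, no walls, unit steps only."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if any(grid[r][c] == "#" for r, c in path):
        return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


def trace_is_valid(trace):
    """Check the steps: a node may only be closed after it was created
    with the same cost. This is a deliberately minimal validity rule."""
    created = set()
    for op, cell, g, _ in trace:
        if op == "create":
            created.add((cell, g))
        elif op == "close" and (cell, g) not in created:
            return False
    return True


grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path_is_valid(grid, (0, 0), (2, 2), path), trace_is_valid(trace))
```

The point of keeping the two checks separate is that they can disagree: a path can pass `path_is_valid` while the accompanying trace fails `trace_is_valid`, which is the kind of mismatch the summary describes.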
