The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
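For anyone who wants to picture the setup the summary describes (traces derived from A* search, paired with a final answer), here is a minimal sketch. It is not the paper's code: the grid encoding, the "create"/"close" line format, and the names astar_with_trace and corrupt_trace are all assumptions made for illustration, and shuffling is just one crude way to produce an "incorrect" trace.

```python
import heapq
import random


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace is a list of text lines recording each
    node that is created or closed; the line format is hypothetical.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = []
    best_g = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(f"close {cell[0]} {cell[1]} cost {g}")
        if cell == goal:
            # Walk parent links back to the start to recover the plan.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_g = g + 1
            if new_g < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = new_g
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))
                trace.append(f"create {nr} {nc} cost {new_g}")
    return None, trace  # goal unreachable


def corrupt_trace(trace, seed=0):
    """Return a shuffled copy of the trace: same lines, meaningless order."""
    shuffled = list(trace)
    random.Random(seed).shuffle(shuffled)
    return shuffled


if __name__ == "__main__":
    grid = [[0, 0, 0],
            [1, 1, 0],
            [0, 0, 0]]
    path, trace = astar_with_trace(grid, (0, 0), (2, 0))
    print("correct trace:  ", trace)
    print("corrupted trace:", corrupt_trace(trace))
    print("final answer:   ", path)
```

A training example in this picture would be the problem, then either the correct or the corrupted trace, then the final path. The result the summary reports is that swapping in the corrupted trace costs little or nothing in final-answer accuracy.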
