The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models on completely random and incorrect reasoning steps. Surprisingly, they still perform about as well as, and sometimes even better than, models trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
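To make the setup in the summary concrete, here is a rough sketch of how a "known solving method (A* search)" can yield both a final answer and a step-by-step trace that are checked separately. The grid maze, the ("create"/"close") trace format, and every function name are assumptions for illustration, not the paper's actual data format or code.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid ('#' is a wall); return (plan, trace).

    trace is a list of (op, cell, g, h) tuples, op in {"create", "close"},
    recorded in the order the search performed them. Hypothetical format.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    best_g, parent, closed = {start: 0}, {start: None}, set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already-expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace

def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only: a wall-free chain of adjacent cells."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for r, c in plan:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(plan, plan[1:]))

def trace_is_valid(trace):
    """Loose check of the steps only: a cell is closed only after creation."""
    created = set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True

GRID = ["..#.",
        "..#.",
        "...."]
START, GOAL = (0, 0), (0, 3)
plan, trace = astar_with_trace(GRID, START, GOAL)

print(plan_is_valid(GRID, START, GOAL, plan))  # answer check
print(trace_is_valid(trace))                   # trace check
print(trace_is_valid(trace[::-1]))             # scrambled trace, same plan
```

The two checkers are independent: reversing the trace makes it fail the trace check while the plan still passes, which is the kind of mismatch between "right answer" and "wrong or messy steps" the summary describes.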
