
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
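For anyone wondering what "check both the final answers and the reasoning steps" could look like in practice, here is a minimal sketch. It is not the paper's code: the grid encoding, the `create`/`close` trace token names, and all function names are assumptions made for illustration. It runs A* on a small grid, records each search step as a trace, and then validates the final path and the trace separately, so the two can disagree.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace is a list of hypothetical
    ("create" | "close", cell, g, h) steps, standing in for the
    intermediate tokens a model would be trained on.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace

def path_is_valid(grid, start, goal, path):
    """Check the final answer only: endpoints, free cells, unit steps."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r, c) in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])) or grid[r][c] == 1:
            return False
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))

def trace_is_valid(trace):
    """Check the intermediate steps only: a cell must be created
    before it is closed, and may be closed at most once."""
    created, closed = set(), set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell in created and cell not in closed:
            closed.add(cell)
        else:
            return False
    return True
```

With checks like these, a correct path paired with a scrambled trace passes `path_is_valid` but fails `trace_is_valid`, which is the kind of mismatch the summary describes.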
