The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
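
To make the distinction in the summary above concrete, here is a minimal Python sketch of checking a final answer and its reasoning trace separately. It is an illustration under stated assumptions, not the paper's setup: the 4x4 grid, the trace format, and the names `astar_with_trace`, `path_is_valid`, and `trace_is_valid` are all hypothetical. It only shows that a path can pass its check while the accompanying trace fails its own.

```python
import heapq

# Hypothetical toy maze (not from the paper): 0 = free cell, 1 = wall.
GRID = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
START, GOAL = (0, 0), (3, 3)


def is_free(cell):
    r, c = cell
    return 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] == 0


def neighbors(cell):
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        if is_free((r + dr, c + dc)):
            yield (r + dr, c + dc)


def manhattan(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def astar_with_trace(start, goal):
    """Run A* and return (path, trace); trace lists each expansion in order."""
    frontier = [(manhattan(start, goal), 0, start)]
    parent, cost, trace = {start: None}, {start: 0}, []
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if g > cost[cell]:
            continue  # stale queue entry, a cheaper route was found already
        trace.append(("expand", cell, g))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        for nxt in neighbors(cell):
            if g + 1 < cost.get(nxt, float("inf")):
                cost[nxt], parent[nxt] = g + 1, cell
                heapq.heappush(frontier, (g + 1 + manhattan(nxt, goal), g + 1, nxt))
    return None, trace


def path_is_valid(path, start, goal):
    """Final-answer check: starts at start, ends at goal, only legal moves."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    return all(b in set(neighbors(a)) for a, b in zip(path, path[1:]))


def trace_is_valid(trace, start):
    """Trace check: each expanded cell is the start (cost 0) or sits next to
    an earlier-expanded cell whose cost is exactly one lower."""
    seen = {}
    for op, cell, g in trace:
        if op != "expand" or not is_free(cell):
            return False
        if cell == start:
            ok = g == 0
        else:
            ok = any(seen.get(n) == g - 1 for n in neighbors(cell))
        if not ok:
            return False
        seen[cell] = g
    return True


path, trace = astar_with_trace(START, GOAL)
scrambled = trace[::-1]  # same steps in reverse order: no longer a coherent search

print(path_is_valid(path, START, GOAL))   # answer check
print(trace_is_valid(trace, START))       # genuine trace
print(trace_is_valid(scrambled, START))   # scrambled trace, same answer
```

The two checks are independent by construction: pairing the same correct path with the scrambled trace leaves the answer check unchanged while the trace check fails, which is the kind of mismatch the summary describes.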
