The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
