The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
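For anyone who wants to picture the setup in that summary, here is a rough sketch of how "correct" and "unrelated" traces could be produced from A* search. This is not the paper's code: the grid encoding, the `create`/`close` trace strings, and the helper names `astar_with_trace` and `make_examples` are all assumptions made for illustration, and rotating traces between problems is just one simple way to get a trace that has nothing to do with the problem it is paired with.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (1 = wall). Returns (path, trace).

    The trace is a list of strings recording every node that is pushed
    ("create") or expanded ("close"). The format is hypothetical.
    """
    def h(cell):
        # Manhattan distance, admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    trace = []
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    cost = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(f"close {cell[0]} {cell[1]} c{g}")
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_g = g + 1
            if new_g < cost.get((nr, nc), float("inf")):
                cost[(nr, nc)] = new_g
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))
                trace.append(f"create {nr} {nc} c{new_g}")
    return None, trace


def make_examples(problems):
    """Build two training sets from (grid, start, goal) problems.

    correct: each problem paired with its own A* trace.
    swapped: each problem paired with the next problem's trace,
             keeping its own (correct) final answer.
    """
    solved = [astar_with_trace(*p) for p in problems]
    correct, swapped = [], []
    for i, (problem, (path, trace)) in enumerate(zip(problems, solved)):
        other_trace = solved[(i + 1) % len(solved)][1]
        correct.append({"problem": problem, "trace": trace, "answer": path})
        swapped.append({"problem": problem, "trace": other_trace, "answer": path})
    return correct, swapped


grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
problems = [(grid, (0, 0), (2, 0)), (grid, (0, 0), (0, 2))]
correct, swapped = make_examples(problems)
```

As I read the summary, the claim is that a model trained on something like `swapped` ends up about as accurate on final answers as one trained on `correct`, which is why the intermediate tokens may not carry the meaning people assume.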
