The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
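
For anyone who wants to see what an A*-derived "reasoning trace" might look like, here is a minimal sketch, assuming a small 4-connected grid maze with unit step costs. The function name `astar_with_trace` and the trace format (a list of "create"/"close" events with g and h values) are hypothetical choices for illustration, not the paper's actual data format or code.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of equal-length strings where '#' marks a wall.
    trace is a list of (event, cell, g, h) tuples with event in
    {"create", "close"}. This format is an illustrative assumption.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost grids.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    g_cost = {start: 0}
    parent = {start: None}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == "#":
                continue
            nxt, ng = (nr, nc), g + 1
            if ng < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))

    return None, trace  # goal unreachable


if __name__ == "__main__":
    maze = ["..#", "..#", "..."]
    path, trace = astar_with_trace(maze, (0, 0), (2, 2))
    print("path:", path)
    for event in trace:
        print(event)
```

In this sketch the final answer is `path` and the intermediate tokens would be some serialization of `trace`. Because both come from the same solver, each can be checked independently, which is roughly the kind of comparison the summary above describes.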
