The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
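
To make the setup in the summary concrete, here is a minimal sketch of the kind of check it describes: run A* on a small grid maze, record the search steps as a trace, and validate the final plan on its own, without looking at the trace. The grid encoding, the `("create" | "close", cell, g, h)` trace format, and the function names `astar_with_trace` and `plan_is_valid` are assumptions made for illustration; they are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid. Returns (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list of
    ("create" | "close", cell, g, h) tuples; this format is illustrative only.
    """
    def h(cell):
        # Manhattan distance, admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent pointers back to the start to recover the plan.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            if g + 1 < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = g + 1
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (g + 1 + h((nr, nc)), g + 1, (nr, nc)))
                trace.append(("create", (nr, nc), g + 1, h((nr, nc))))
    return None, trace


def plan_is_valid(grid, start, goal, path):
    """Check the final answer alone: in bounds, wall-free, contiguous."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    # Every consecutive pair of cells must be exactly one step apart.
    return all(abs(r1 - r2) + abs(c1 - c2) == 1
               for (r1, c1), (r2, c2) in zip(path, path[1:]))


grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(plan_is_valid(grid, (0, 0), (2, 2), path), len(trace))
```

Because `plan_is_valid` never reads the trace, a plan can pass this check whether the accompanying trace is a faithful A* run, a corrupted one, or one taken from a different maze. That separation is what lets answer accuracy and trace correctness be measured independently.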
