The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
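To make "reasoning steps based on a known solving method (A* search)" concrete, here is a minimal sketch of how a search run can emit a step-by-step trace alongside its final answer, and how such a trace can be checked independently of the answer. The grid world, the `("create" | "close", cell, cost)` trace format, and the names `astar_with_trace` and `trace_is_valid` are all assumptions made for illustration; they are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid; return (path, trace).

    grid is a list of strings where '#' marks a wall. trace is a list of
    (op, cell, g_cost) tuples, op being "create" or "close", in the order
    the search performed them. This format is hypothetical.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0)]
    frontier = [(h(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng))
    return None, trace


def trace_is_valid(trace):
    """Check one simple invariant: a cell is only closed at a cost it was created with."""
    created = set()
    for op, cell, g in trace:
        if op == "create":
            created.add((cell, g))
        elif (cell, g) not in created:
            return False
    return True


grid = ["..#",
        "...",
        "#.."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path)
print(trace_is_valid(trace))
```

Because the trace can be validated separately from the path, a setup like this can separate "the answer is right" from "the steps are right", which is the distinction the summary above is drawing.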

The Law of Leaky Abstractions (2002)

https://www.joelonsoftware.com/2002/11/11/the-law-of-leaky-abstractions/
1•tosh•55s ago•0 comments

Show HN: Nativeblocks – Code-push for native Kotlin and Swift apps

https://nativeblocks.io/
1•alirezat775•57s ago•0 comments

The Code's the Thing

https://www.jimgumbley.com/blog/the-codes-the-thing.html
1•LeonigMig•1m ago•0 comments

DriftLens integrates the practice of self-observation

https://driftlens.substack.com/p/grounded-in-monastic-self-observation
1•driftlensOS•1m ago•0 comments

Gemini Omni

https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/
1•pretext•2m ago•0 comments

Potemkin "Independence" May Be the Future of the Fed in Bank Regulation

https://toddhbaker.substack.com/p/potemkin-independence-may-be-the
1•petethomas•5m ago•0 comments

Has the Anthropic Settlement Changed Everything?

https://writerbeware.blog/2026/05/22/has-the-anthropic-settlement-changed-everything/
1•speckx•5m ago•0 comments

Show HN: CoreMem – Portable context for AI agents

https://coremem.app
2•20wenty•8m ago•0 comments

2-time NASCAR champ Kyle Busch dies at 41 after a 'severe illness'

https://www.washingtonpost.com/sports/auto-racing/2026/05/21/nascar-kyle-busch-hospitalized/bb0c5...
1•bookofjoe•8m ago•1 comments

Phind.com has shut down completely

https://old.reddit.com/r/LocalLLaMA/comments/1qfbt9f/local_replacement_for_phindcom/
3•behnamoh•12m ago•0 comments

Seven days of fasting transforms the human body

https://www.sciencedaily.com/releases/2026/05/260517030404.htm
1•ketanmaheshwari•13m ago•0 comments

Vulnerabilities in various GTK-based PDF readers

https://lwn.net/Articles/1073944/
2•Brajeshwar•14m ago•0 comments

Tulsi Gabbard resigns as US director of national intelligence

https://www.bbc.com/news/articles/cvgj2gkv1x1o
5•thm•15m ago•1 comments

Cartesia's Sonic-3.5 Takes #1 on Artificial Analysis Speech Leaderboard

https://artificialanalysis.ai/text-to-speech/leaderboard
2•ganeshmm•16m ago•0 comments

Commencement Speeches

https://apps.npr.org/commencement/
1•gcanyon•16m ago•0 comments

Thinking in an Array Language

https://github.com/razetime/ngn-k-tutorial/blob/main/12-thinking-in-k.md
3•tosh•18m ago•0 comments

Sundar Pichai discusses AI search

https://www.nytimes.com/2026/05/22/podcasts/sundar-pichai-understands-why-people-are-anxious-abou...
2•garyrob•19m ago•0 comments

Bumblebee: Read-only supply-chain inventory for macOS/Linux dev machines.

https://github.com/perplexityai/bumblebee
1•georgehill•21m ago•0 comments

Interim Install Guide: KDE Neon for a professional digital painter workstation

https://www.davidrevoy.com/article1145/interim-install-guide-kde-neon-user-edition-for-a-professi...
2•Tomte•22m ago•0 comments

Immigrants waiting for a Green Card must return to their home country to apply

https://twitter.com/DHSgov/status/2057817233200418837
3•freddier•22m ago•0 comments

AVX-512 and Validating Usage on AMD EPYC

https://www.amd.com/en/blogs/2026/understanding-avx-512---validating-usage-on-amd-epyc-.html
1•tosh•22m ago•0 comments

Canada's shortwave radio time standard station CHU to go dark June 22nd

https://nrc.canada.ca/en/certifications-evaluations-standards/canadas-official-time/nrc-shortwave...
2•joe_bleau•24m ago•0 comments

Kash Patel merch site hacked to trick users into installing malware

https://san.com/cc/kash-patels-personal-merch-site-hacked-to-trick-users-into-installing-malware/
4•impish9208•27m ago•1 comments

India's quiet redrawing of research integrity's accountability chain

https://www.researchinformation.info/analysis-opinion/indias-quiet-redrawing-of-research-integrit...
1•rustoo•28m ago•0 comments

Lattice: Grid-based space navigation for macOS

https://github.com/bryancostanich/lattice
2•keithba•28m ago•1 comments

Microsoft Drops Claude Code After Budget Overrun

https://aiweekly.co/alerts/microsoft-drops-claude-code-after-budget-overrun
53•robertkarl•29m ago•19 comments

TorQ: Kdb+ Production Framework

https://github.com/DataIntellectTech/TorQ
2•tosh•29m ago•0 comments

API to fix telemetry pathologies and calculate invariant proxies

https://dashboard.render.com/web/srv-d888rk8g4nts73et5fv0/deploys/dep-d888rkog4nts73et5ghg?r=2026...
1•Oliviana•30m ago•0 comments

CRTC to require streamers pay 15% of annual rev to support Canadian content

https://nationalpost.com/news/crtc-to-require-online-streamers-to-pay-15-of-annual-revenues-to-su...
1•fidotron•31m ago•0 comments

Why Ruby Still Feels Like Home After All These Years

https://caio.ca/blog/why-ruby-still-feels-like-home
3•birdculture•31m ago•0 comments