The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
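
To make the setup in the summary above more concrete, here is a minimal sketch of what "training on A* traces" could look like. Everything in it is an assumption for illustration: the grid maze, the `create`/`close` trace format, and the helper names `astar_with_trace`, `path_is_valid` and `make_example` are hypothetical and not taken from the paper. The "swapped trace" option is just one simple way to pair a problem with reasoning steps unrelated to it; the paper's actual corruption procedure may differ.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and record each search step.

    grid is a list of strings where '#' marks a wall.
    Returns (path, trace); path is None if the goal is unreachable.
    The trace format is a made-up illustration, not the paper's.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(open_heap, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng, h((nr, nc))))
    return None, trace


def path_is_valid(grid, start, goal, path):
    """Check the final answer independently of any trace."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True


def make_example(grid, start, goal, swapped_trace=None):
    """Build one training example: problem, intermediate tokens, answer.

    If swapped_trace is given, it replaces the real trace, so the
    intermediate tokens have no relation to this problem (assumption:
    a stand-in for the "random and incorrect" steps in the summary).
    """
    path, trace = astar_with_trace(grid, start, goal)
    return {
        "problem": (tuple(grid), start, goal),
        "trace": trace if swapped_trace is None else swapped_trace,
        "answer": path,
    }


maze_a = ["..#", "...", "#.."]
maze_b = ["...", ".#.", "..."]
_, trace_b = astar_with_trace(maze_b, (0, 0), (2, 2))
clean = make_example(maze_a, (0, 0), (2, 2))
swapped = make_example(maze_a, (0, 0), (2, 2), swapped_trace=trace_b)
```

Both `clean` and `swapped` carry the same correct answer, but only `clean` has a trace that actually belongs to its maze. Because the answer and the trace can be checked separately (`path_is_valid` for the former, replaying the search for the latter), this kind of setup lets you ask whether answer accuracy depends on trace correctness at all.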
