The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in more layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
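To make the "check both the final answers and the reasoning steps" part concrete, here is a rough Python sketch of the idea as I understand it from the summary above. It is not the paper's code: the grid encoding, the `("create" | "close", cell, g, h)` trace format, and every function name are my own assumptions for illustration. A* solves a small grid maze and logs its search operations; one checker validates the final plan and a separate one validates the trace.

```python
import heapq


def manhattan(cell, goal):
    return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid ('#' is a wall). Returns (plan, trace).

    trace is a list of (op, cell, g, h) tuples, op in {"create", "close"}.
    This trace format is an assumption made for this sketch.
    """
    rows, cols = len(grid), len(grid[0])
    trace = [("create", start, 0, manhattan(start, goal))]
    frontier = [(manhattan(start, goal), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, manhattan(cell, goal)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == "#":
                continue
            if g + 1 < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = g + 1
                parent[(nr, nc)] = cell
                h = manhattan((nr, nc), goal)
                heapq.heappush(frontier, (g + 1 + h, g + 1, (nr, nc)))
                trace.append(("create", (nr, nc), g + 1, h))
    return None, trace


def plan_is_valid(grid, start, goal, plan):
    """Final-answer check: a wall-free path of unit steps from start to goal."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    rows, cols = len(grid), len(grid[0])
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
        if not (0 <= r2 < rows and 0 <= c2 < cols) or grid[r2][c2] == "#":
            return False
    return True


def trace_is_valid(trace, goal):
    """Trace check: a cell is closed at most once and only after it was
    created, and every logged h equals the Manhattan distance to goal."""
    created, closed = set(), set()
    for op, cell, _g, h in trace:
        if h != manhattan(cell, goal):
            return False
        if op == "create":
            created.add(cell)
        elif op == "close":
            if cell not in created or cell in closed:
                return False
            closed.add(cell)
        else:
            return False
    return True


if __name__ == "__main__":
    grid = ["..#.",
            "..#.",
            "...."]
    plan, trace = astar_with_trace(grid, (0, 0), (0, 3))
    print(plan_is_valid(grid, (0, 0), (0, 3), plan), trace_is_valid(trace, (0, 3)))
    # Same plan paired with a scrambled (here: reversed) trace.
    print(plan_is_valid(grid, (0, 0), (0, 3), plan), trace_is_valid(trace[::-1], (0, 3)))
```

The point of keeping the two checkers separate is that a correct plan can be paired with a trace that fails validation, which is the kind of mismatch the summary describes. Reversing the trace is just a cheap stand-in for the "random and incorrect" traces mentioned above, not how the authors built theirs.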

Snail Mail Sign-Ups

https://btxx.org/posts/snail-mail-signups/
1•bradley_taunt•3m ago•0 comments

After building agents for 2 years, I stopped using function calling

https://old.reddit.com/r/LocalLLaMA/comments/1rrisqn/i_was_backend_lead_at_manus_after_building_a...
1•gmays•10m ago•0 comments

Skill Issue: Harness Engineering for Coding Agents

https://www.humanlayer.dev/blog/skill-issue-harness-engineering-for-coding-agents
1•vinhnx•12m ago•0 comments

Professor predicts US goes to war with Iran and loses (9 months ago) [video]

https://www.youtube.com/watch?v=7y_hbz6loEo
1•0xbadcafebee•15m ago•0 comments

Rotating home owners boast of 360-degree views and energy benefits

https://www.abc.net.au/news/2026-03-14/rotating-home-owners-boast-of-360-degree-views/106348854
1•defrost•16m ago•0 comments

Show HN: DAAO – Deploy AI agents to your servers via Zero-Trust tunnels

https://github.com/daao-platform/daao
1•dan3093•27m ago•0 comments

The Dirty, Dystopian World of AI Data Centers

https://www.theatlantic.com/magazine/2026/04/ai-data-centers-energy-demands/686064/
1•chrisaycock•27m ago•0 comments

Federal Judge Quashes Justice Department Subpoenas of Fed Chair Jerome Powell

https://www.cnn.com/2026/03/13/politics/fed-chair-jerome-powell-subpoena
3•rickcarlino•29m ago•0 comments

Charles B. McVay III

https://en.wikipedia.org/wiki/Charles_B._McVay_III
2•thunderbong•31m ago•0 comments

Silicon Valley is buzzing about this new idea: AI compute as compensation

https://www.businessinsider.com/ai-compute-compensation-software-engineers-greg-brockman-2026-3
1•pabs3•32m ago•0 comments

Arno's Engram Keyboard Layouts

https://github.com/binarybottle/engram
1•so-cal-schemer•34m ago•1 comments

Rust Shined over Python for My CLI Tool

https://smiling.dev/blog/rust-shined-over-python-for-my-cli-tool/
1•vinhnx•35m ago•0 comments

Companies That Should Exist

https://anastasiagamick.substack.com/p/companies-that-should-exist
1•paulpauper•35m ago•0 comments

How Does AI Distribute the Pie? Large Language Models and the Ultimatum Game

https://www.nber.org/papers/w34919
1•paulpauper•36m ago•0 comments

Buying Back Our Slack

https://www.seeingthesystem.com/p/buying-back-our-slack
1•paulpauper•36m ago•0 comments

Agentic AI: Workflows vs. Agents

https://www.youtube.com/watch?v=Qd6anWv0mv0
1•Brysonbw•37m ago•0 comments

MacBook Neo Teardown [video]

https://www.youtube.com/watch?v=PbPCGqoBB4Y
1•Lwrless•37m ago•0 comments

Trump now Selling National Security Briefing Membership

https://www.cnn.com/2026/03/13/politics/trump-fundraise-email-soldier
9•mandeepj•38m ago•1 comments

Big tech engineers need big egos

https://www.seangoedecke.com/big-tech-needs-big-egos/
1•jnord•40m ago•0 comments

Void – Ship Vite apps at warp speed

https://void.cloud
1•todotask2•42m ago•0 comments

1997 Kyl–Bingaman Amendment prohibits high res satellite imagery of Israel

https://en.wikipedia.org/wiki/Kyl%E2%80%93Bingaman_Amendment
3•spaghetdefects•44m ago•0 comments

Show HN: Paw-proxy – Named HTTPS domains for every local dev server

https://alexcatdad.github.io/paw-proxy/
1•alex_tc•50m ago•0 comments

Show HN: What agentic collaborative pentesting looks like

https://www.youtube.com/watch?v=PU5BicXMiio
1•integsec•50m ago•0 comments

JPEG Compression

https://www.sophielwang.com/blog/jpeg
2•vinhnx•50m ago•0 comments

Want to hack your body with peptides? If only the science agreed

https://www.economist.com/science-and-technology/2026/03/11/want-to-hack-your-body-with-peptides-...
2•andsoitis•54m ago•0 comments

Biomass-based furan epoxies with high-performance and closed-loop recyclability

https://www.sciencedirect.com/science/article/pii/S1359836825011722
2•PaulHoule•57m ago•0 comments

US withdraws draft rule that called for global AI chip permits

https://www.bloomberg.com/news/articles/2026-03-14/us-withdraws-draft-rule-that-called-for-global...
2•htrp•57m ago•0 comments

Security Layer for Claude Code

https://www.oculisecurity.com/
1•rellaElla•59m ago•0 comments

Ask HN: Why can't we just make more RAM?

5•chatmasta•1h ago•6 comments

TB Eradicator: Space Invaders but the Enemies Are TB Bacteria

https://tberadicator.com
2•YossarianFrPrez•1h ago•0 comments