The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
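To make the setup in the summary more concrete, here is a minimal sketch of how an A* solver can emit both a step-by-step trace and a final plan, which is the kind of (trace, answer) pair the summary says the models were trained on. The grid world, the function name `astar_with_trace`, and the `("create" | "close", cell, cost, heuristic)` trace format are my own assumptions for illustration; the summary does not say what the paper's traces actually look like.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (trace, plan).

    grid is a list of equal-length strings where '#' marks a wall.
    The trace format is hypothetical: one tuple per search event,
    (event, cell, cost_so_far, heuristic).
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))

        if cell == goal:
            # Walk parent links back to the start to recover the plan.
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return trace, plan[::-1]

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                trace.append(("create", nxt, new_cost, h(nxt)))
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))

    return trace, None  # goal unreachable


if __name__ == "__main__":
    trace, plan = astar_with_trace([".#", ".."], (0, 0), (1, 1))
    for step in trace:
        print(step)
    print("plan:", plan)
```

Because the trace and the plan come from the same solver, both can be checked mechanically, which is what lets the summary talk about a correct answer paired with a wrong trace. Pairing the same plan with a trace from a different problem would be one way to picture the "random and incorrect reasoning steps" condition, though that is my reading of the summary and not a description of the paper's procedure.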
