frontpage.

Made with ♥ by @iamnishanth


The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•7mo ago

Comments

tocs3•7mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
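To make the setup in that summary concrete, here is a minimal sketch of what "checking the final answer and the reasoning steps separately" could look like. It is not the paper's code: the grid, the `("create" | "close", cell, g, h)` trace format, and the function names `astar_with_trace`, `plan_is_valid`, and `trace_is_valid` are all assumptions made up for illustration. It runs A* on a tiny grid, records a trace of the search, and then validates the plan and the trace independently, so a correct plan can be paired with a scrambled trace.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid ('#' = wall); return (plan, trace).

    The trace is a list of (op, cell, g, h) tuples with op in
    {"create", "close"}. This format is hypothetical.
    """
    def h(cell):  # Manhattan distance, admissible for unit step costs
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent, best_g, closed = {start: None}, {start: 0}, set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:  # stale queue entry
            continue
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:  # walk parents back to the start
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#" or nxt in closed:
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt], parent[nxt] = g + 1, cell
                trace.append(("create", nxt, g + 1, h(nxt)))
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
    return None, trace

def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only: a wall-free path of unit steps."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    if any(grid[r][c] == "#" for r, c in plan):
        return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(plan, plan[1:]))

def trace_is_valid(trace):
    """Check the steps only: a cell is closed once, after being created."""
    created, closed = set(), set()
    for op, cell, _g, _h in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell in created and cell not in closed:
            closed.add(cell)
        else:
            return False
    return True

GRID = ["..#",
        "...",
        "#.."]
plan, trace = astar_with_trace(GRID, (0, 0), (2, 2))
scrambled = trace[::-1]  # same tokens, order no longer follows A*
print(plan_is_valid(GRID, (0, 0), (2, 2), plan))
print(trace_is_valid(trace), trace_is_valid(scrambled))
```

The two checks are deliberately independent: `plan_is_valid` never looks at the trace, and `trace_is_valid` never looks at the plan. That separation is what lets an experiment ask whether answer accuracy depends on the steps being correct.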

A $2,500 full body scan said he was healthy. Then he had a catastrophic stroke

https://www.washingtonpost.com/health/2026/01/13/prenuvo-lawsuit-full-body-scan/
1•ThePhantom•3m ago•0 comments

Show HN: A slsqp solver WASM demo

https://slsqp-wasm.shuo23333.app/
1•shuoli84•4m ago•0 comments

Minimal Claude Code in 250 lines

https://github.com/1rgs/nanocode
1•yuedongze•4m ago•0 comments

Show HN: Axis – A systems programming language with Python syntax

https://github.com/AGDNoob/axis-lang
1•AGDNoob•5m ago•1 comments

Minor says ICE took his iPhone, later found in used-electronics vending machine

https://www.propublica.org/article/videos-ice-dhs-immigration-agents-using-chokeholds-citizens
2•spenvo•5m ago•1 comments

The Mole: Infiltrating North Korea [video]

https://www.youtube.com/watch?v=Sgq-_VNptSc
2•matousd•8m ago•0 comments

South Korea seeks death penalty for ex-president Yoon over martial law bid

https://www.cnbc.com/2026/01/14/south-korea-special-prosecutor-seek-death-penalty-former-presiden...
1•nodesocket•9m ago•0 comments

Show HN: Cymatica – Chladni Plate Simulator

https://www.cymatica.app/
1•_august•13m ago•0 comments

US approves sale of Nvidia's advanced H200 chips to China

https://www.bbc.com/news/articles/cg4erx1n04lo
2•tlyleung•17m ago•0 comments

QuickSend – Share Files Between Devices Without the Friction

http://quicksend.chat
1•foodhome•20m ago•1 comments

Wrapping my head around Gas Town

https://justin.abrah.ms/blog/2026-01-05-wrapping-my-head-around-gas-town.html
2•gmays•25m ago•1 comments

Bottom-up programming as the root of LLM dev skepticism

https://www.klio.org/theory-of-llm-dev-skepticism/
1•mkozlows•26m ago•0 comments

Show HN: Neutriva – A personalized health and wellness tracking assistant

https://neutriva.com/en/wellness-assistant
1•NoraWW•28m ago•0 comments

Medical Groups Will Try to Block Childhood Vaccine Recommendations

https://www.nytimes.com/2026/01/13/health/vaccine-schedule-children-kennedy.html
3•doener•37m ago•0 comments

Terry Tao: "LLMs Are Simpler Than You Think – The Real Mystery Is Why They Work" [video]

https://www.youtube.com/watch?v=ukpCHo5v-Gc
8•gmays•37m ago•0 comments

Starlink Users in Iran Get Free Internet Access, Nonprofit Says

https://www.nytimes.com/2026/01/13/technology/iran-starlink-elon-musk.html
4•doener•39m ago•2 comments

One Simple Arrow Changed Automobiles Forever [video]

https://www.wsj.com/video/series/on-the-news/how-one-simple-arrow-changed-automobiles-forever/C33...
1•fortran77•41m ago•0 comments

Qualcomm's RISC-Ventana Fusion

https://thechipletter.substack.com/p/qualcomms-risc-ventana-fusion
1•chmaynard•41m ago•0 comments

What's Ahead: Alien Processes, Domains, and Data Models

https://practicaldatamodeling.substack.com/p/whats-ahead-alien-processes-domains
1•gmays•43m ago•0 comments

Why IRC is better than Real Life

https://everything2.com/node/e2node/Why%20IRC%20is%20better%20than%20Real%20Life
2•jskherman•44m ago•0 comments

A new generation of Chinese companies is expanding around the world

https://www.economist.com/business/2026/01/13/a-new-generation-of-chinese-companies-is-expanding-...
3•petethomas•47m ago•0 comments

Southern New Zealand hospitals experienced major IT outage

https://www.rnz.co.nz/news/national/584026/public-service-association-says-southern-hospitals-exp...
3•billybuckwheat•47m ago•0 comments

What will enshittification of LLMs look like?

1•scoofy•47m ago•3 comments

An Updated Dentist Office Software Story

https://avc.xyz/an-updated-dentist-office-software-story
1•turadg•49m ago•1 comments

BioNTech Provides Strategic Business Update and Outlines 2026

https://investors.biontech.de/news-releases/news-release-details/biontech-provides-strategic-busi...
2•doener•51m ago•0 comments

How to Store the Web in S3

https://exa.ai/blog/exa-d
1•willbryk•52m ago•1 comments

Signal creator Moxie Marlinspike wants to do for AI what he did for messaging

https://arstechnica.com/security/2026/01/signal-creator-moxie-marlinspike-wants-to-do-for-ai-what...
5•abolishme•54m ago•0 comments

Lawsuit: DHS wants "unlimited subpoena authority" to unmask ICE critics

https://arstechnica.com/tech-policy/2026/01/instagram-user-fights-dhs-for-the-right-to-post-ice-s...
6•duxup•56m ago•0 comments

OpenAI buys tiny health records startup Torch for, reportedly, $100M

https://techcrunch.com/2026/01/12/openai-buys-tiny-health-records-startup-torch-for-reportedly-100m/
2•nsoonhui•56m ago•0 comments

China's durian craze has turned this tropical fruit into a tool of diplomacy

https://theconversation.com/chinas-durian-craze-has-turned-this-tropical-fruit-into-a-tool-of-dip...
2•PaulHoule•57m ago•0 comments