The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
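For anyone wondering what a "step-by-step reasoning path based on A* search" could look like as training text, here is a minimal sketch. It runs A* on a small grid maze and logs every node it creates or closes, then returns the final path. The grid encoding, the function name `astar_with_trace`, and the "create"/"close" line format are assumptions made for illustration; they are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid with unit step cost.

    grid is a list of rows, 0 = free cell, 1 = wall (hypothetical encoding).
    Returns (trace, path): trace is a list of strings logging each search
    step, path is the list of cells from start to goal, or None if the
    goal is unreachable.
    """
    def h(cell):
        # Manhattan distance, admissible and consistent for this grid
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [f"create {start} g=0 h={h(start)}"]
    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry, a cheaper copy was already expanded
        closed.add(cell)
        trace.append(f"close {cell} g={g} h={h(cell)}")

        if cell == goal:
            # Walk parent pointers back to the start to recover the plan
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return trace, path[::-1]

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue  # off the grid
            if grid[nr][nc] == 1:
                continue  # wall
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                trace.append(f"create {nxt} g={ng} h={h(nxt)}")

    return trace, None
```

Under these assumptions, a training example with "correct" reasoning would be the problem, then the trace lines, then the path. The corrupted variant the summary describes would keep the problem and the path but substitute trace lines unrelated to that problem.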
