The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
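For anyone wondering what a "step-by-step reasoning path based on A* search" might look like in practice, here is a minimal sketch. It assumes a small grid maze and a made-up trace format (`create` / `close` lines); the summary above does not give the paper's actual problem encoding or token format, so every name here (`astar_with_trace`, `plan_is_valid`, the trace strings) is hypothetical. The point it illustrates is that the final plan can be checked on its own, independently of whether the trace is a faithful record of the search.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace format is a hypothetical choice
    for illustration, not the format used in the paper.
    """
    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    came_from = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = [f"create {start} g=0 h={h(start)}"]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry, already expanded with a better g
        closed.add(cell)
        trace.append(f"close {cell} g={g}")
        if cell == goal:
            # Walk parent pointers back to the start to recover the plan.
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                    trace.append(f"create {nxt} g={ng} h={h(nxt)}")
    return None, trace


def plan_is_valid(grid, path, start, goal):
    """Check the final answer alone, ignoring the trace entirely."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] != 0:
            return False
    return grid[start[0]][start[1]] == 0


maze = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(maze, (0, 0), (2, 0))
print(path)
print("\n".join(trace))
```

A training example in this style would pair the trace lines (the intermediate tokens) with the path (the answer). Since `plan_is_valid` never looks at the trace, swapping in a shuffled or unrelated trace leaves the answer check unchanged, which is the kind of comparison the summary describes.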
