The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
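For anyone wondering what "reasoning paths based on A* search" could look like in practice, here is a minimal sketch. It assumes a small grid maze and a made-up trace format of ("create" | "close", cell, g, h) tuples; the paper's actual problem encoding and trace format are not given in the summary above, so every name here (`astar_with_trace`, `plan_is_valid`, the tuple layout) is hypothetical. The point is only that a search algorithm yields two separately checkable things: the intermediate steps and the final plan.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (trace, path).

    grid is a list of equal-length strings where '#' marks a wall.
    trace is a list of (op, cell, g, h) tuples, a hypothetical format
    chosen for illustration; path is a list of cells or None.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            # Walk parent pointers back to the start to recover the plan.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return trace, path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            if g + 1 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g + 1
                parent[nxt] = cell
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt, g + 1, h(nxt)))
    return trace, None


def plan_is_valid(grid, start, goal, path):
    """Check the final answer only: a wall-free, step-by-step route."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    # Consecutive cells must be exactly one move apart.
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))
```

Calling `astar_with_trace(["..#", "...", "#.."], (0, 0), (2, 2))` returns a trace plus a plan that `plan_is_valid` accepts. A model trained on (trace, plan) pairs could then be scored on the plan alone or on the trace as well, which is the distinction the summary draws.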
