
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•12mo ago

Comments

tocs3•12mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
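For readers who want something concrete, here is a minimal sketch of what "reasoning steps derived from A* search" could look like. It assumes a 4-connected grid maze with unit step costs and a Manhattan-distance heuristic. The function name `astar_with_trace` and the `("create", ...)` / `("close", ...)` step format are illustrative assumptions, not the paper's actual trace format.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (1 = wall, 0 = free).

    Returns (path, trace). `path` is the final answer (None if the goal is
    unreachable); `trace` is the list of search steps taken along the way.
    The step format is a hypothetical one chosen for illustration.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            # Walk the parent links back to the start to build the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            new_g = g + 1
            if new_g < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
                trace.append(("create", nxt, new_g, h(nxt)))

    return None, trace


if __name__ == "__main__":
    maze = [
        [0, 0, 0],
        [1, 1, 0],
        [0, 0, 0],
    ]
    path, trace = astar_with_trace(maze, (0, 0), (2, 0))
    print("answer:", path)
    print("steps: ", len(trace))
```

In the summary's terms, `trace` plays the role of the step-by-step "thoughts" and `path` is the final answer. Because both come from the same solver, a trace can be checked for validity independently of whether the answer is right, and the random-steps experiment described above would roughly correspond to keeping `path` while swapping in an unrelated `trace`.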

Show HN: QuiteGPT – makes GPT response shorter

https://quiet-gpt.craftgarden.io/
1•leapoahead1•48s ago•0 comments

I made an AI Interior consultant

https://studio.mystofa.com/en-US
1•assorium•2m ago•0 comments

Bitcoin's Power Law: Weak Structure, Strong Forecasts

https://arxiv.org/abs/2605.21316
1•CarlosBaquero•2m ago•0 comments

Consciousness, Gödel, and the Boundary of the Box

https://twitter.com/VFD_org/status/2057053649315013042
1•__patchbit__•4m ago•0 comments

YAML? That's Norway Problem

https://lab174.com/blog/202601-yaml-norway/
1•theanonymousone•10m ago•0 comments

New on Platform

1•rockstaradi•12m ago•0 comments

Ye Olde RFC

https://github.com/gabinante/ye-olde-rfc
1•oooyay•12m ago•0 comments

Nostr-VPN Is one of the most useful things in open source

https://git.iris.to/#/npub1xdhnr9mrv47kkrn95k6cwecearydeh8e895990n3acntwvmgk2dsdeeycm/nostr-vpn
1•abhsag24•12m ago•0 comments

70% of Faculty Vote to Overhaul Harvard Grading with a Cap

https://www.thecrimson.com/article/2026/5/20/fas-passes-a-grade-cap/
1•k2enemy•13m ago•0 comments

List of Alleged Extraterrestrial Beings

https://en.wikipedia.org/wiki/List_of_alleged_extraterrestrial_beings
2•thunderbong•18m ago•0 comments

Software engineering is the tipping point [video]

https://www.youtube.com/watch?v=9t9Kj2f6wtU
1•azhenley•20m ago•0 comments

The Mysterious XF86AudioPlay Issue

https://michael-prokop.at/blog/2026/05/20/the-mysterious-xf86audioplay-issue/
2•JNRowe•25m ago•0 comments

VPNs: The "Most Trusted" Security Tool Until Claude Roasts It in a Weekend

https://www.hacktron.ai/blog/cve-2026-0265-panos-globalprotect-cas-auth-bypass
3•jordybg•26m ago•0 comments

Show HN: Macfigure – Mac configuration in pkl. Simple alternative to Nix-Darwin

https://github.com/Quintisimo/macfigure
1•quintisimo•26m ago•0 comments

Ask HN: Transition from Engineering Manager to IC

1•lavsv•27m ago•0 comments

The Secrets Revealed in SpaceX's IPO Filing

https://www.wsj.com/business/spacex-ipo-takeaways-cea33689
1•1vuio0pswjnm7•29m ago•0 comments

SpaceX Filing Reveals $4.28B Loss, Musk's Tight Grip (3)

https://news.bloomberglaw.com/capital-markets/musks-spacex-files-publicly-for-nasdaq-ipo-under-sy...
3•1vuio0pswjnm7•30m ago•0 comments

SpaceX IPO filing lays bare losses and Musk control as it stakes future on AI

https://www.reuters.com/legal/transactional/bound-mars-elon-musks-spacex-unveils-filing-blockbust...
3•1vuio0pswjnm7•31m ago•0 comments

Reconnecting. – – 5/5 why don't they fix codex

1•apoorvdarshan•33m ago•0 comments

Why is writing good test cases still painfully manual in 2026?

https://calendly.com/qualityfolio2026/30min
1•Daniel_Carter•33m ago•0 comments

Lazydiff: Terminal PR Review with AST-Aware Semantic Diff Rendering

https://github.com/Ataraxy-Labs/lazydiff
1•rohanucla•37m ago•0 comments

Is Python Becoming Pinyin?

https://lernerpython.com/2026/05/19/is-python-becoming-pinyin/
3•reuven•47m ago•0 comments

Rewrite System Showdown: Stochastic Search vs. EqSat

https://arxiv.org/abs/2605.19005
1•pcfwik•51m ago•0 comments

Scheme Interpreter

https://cs61a.org/proj/scheme/
1•HiPHInch•53m ago•0 comments

Understanding KV Cache: The Hidden Memory Cost of Serving LLMs

https://melchi.me/posts/kv-cache/
2•colescodes•55m ago•0 comments

Opn-Chat

https://opn-chat-v1-freebuff.vercel.app
1•fcapuz•56m ago•0 comments

Workers in India are training robots that may replace them

https://indianexpress.com/article/business/workers-india-training-robots-replace-factory-10698712/
3•methuselah_in•57m ago•0 comments

Why does the arrow (->) operator in C exist?

https://stackoverflow.com/questions/13366083/why-does-the-arrow-operator-in-c-exist
5•gnabgib•1h ago•0 comments

Trump admin didn't want Ebola-exposed Americans, sent them to Berlin, Prague

https://arstechnica.com/health/2026/05/trump-admin-didnt-want-ebola-exposed-americans-sent-them-t...
5•donutshop•1h ago•0 comments

Housekeeping Wages to Top $110K Year in New York City by Early 2030s

https://loyaltylobby.com/2026/05/20/housekeeping-wages-to-top-110k-year-in-new-york-city-by-early...
1•lxm•1h ago•0 comments