The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•6mo ago

Comments

tocs3•6mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
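
To make the setup in that summary concrete, here is a minimal sketch of what "a trace from A* search plus a final answer, both checkable" could look like. Everything specific is an assumption for illustration: the grid world, the `create`/`close` trace format, and the function names `astar`, `plan_is_valid`, and `trace_is_valid` are hypothetical and not taken from the paper.

```python
import heapq

def astar(grid, start, goal):
    """A* on a 4-connected grid ('#' = wall). Returns (path, trace).

    The trace is a list of ("create" | "close", cell, cost) records,
    a hypothetical stand-in for the intermediate tokens in the summary.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]
    came_from = {start: None}
    cost = {start: 0}
    closed = set()
    trace = []
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, g))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < cost.get(nxt, float("inf")):
                cost[nxt] = ng
                came_from[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng))
    return None, trace

def plan_is_valid(grid, start, goal, path):
    """Check the final answer: unit steps from start to goal, no walls."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True

def trace_is_valid(trace, start):
    """Check the steps: a cell may only be closed at the cost it was last created with."""
    created = {start: 0}
    for op, cell, g in trace:
        if op == "create":
            created[cell] = g
        elif created.get(cell) != g:
            return False
    return True

grid = ["..#",
        "..#",
        "..."]
path, trace = astar(grid, (0, 0), (2, 2))
print(plan_is_valid(grid, (0, 0), (2, 2), path), trace_is_valid(trace, (0, 0)))
```

The two validators are independent, which is the point of the comparison: a correct plan can be paired with a scrambled trace, and `plan_is_valid` will still pass while `trace_is_valid` fails.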

GlobalBuildingAtlas (3D Models of 2.8B Buildings in the World)

https://tubvsig-so2sat-vm1.srv.mwn.de
1•jotaen•1m ago•1 comments

Standards Queues

https://bkardell.com/blog/Queues.html
1•speckx•1m ago•0 comments

The "confident idiot" problem: Why AI needs hard rules, not vibe checks

https://steerlabs.substack.com/p/confident-idiot-problem
1•steerlabs•2m ago•0 comments

Micron stops selling memory to consumers as demand spikes from AI chips

https://www.cnbc.com/2025/12/03/micron-stops-selling-memory-to-consumers-demand-spikes-from-ai-ch...
2•1vuio0pswjnm7•6m ago•0 comments

Sine wave cube in less than 1K

https://raurir.com/posts/js1k-cube-sine-sdf/
2•rauri•6m ago•0 comments

Bio-essential sugars in samples from asteroid Bennu

https://www.nature.com/articles/s41561-025-01838-6
1•Luc•7m ago•0 comments

Researchers find what makes AI chatbots politically persuasive

https://arstechnica.com/science/2025/12/researchers-find-what-makes-ai-chatbots-politically-persu...
1•furcyd•7m ago•0 comments

[ Hello Blog

https://nobloat.org/articles/2025-07-01-hello-blog.html
2•cinemast•7m ago•0 comments

Show HN: Cheap OpenTelemetry lakehouses with Parquet, DuckDB, and Iceberg

https://clay.fyi/blog/cheap-opentelemetry-lakehouses-parquet-duckdb-iceberg/
2•smithclay•8m ago•0 comments

Random.org

https://www.random.org/
2•bookofjoe•10m ago•0 comments

MCP Web Host

https://mcphost.link/
1•init0•10m ago•1 comments

Your Dorky Spatial Database is My Magic Answer Machine [video]

https://www.youtube.com/watch?v=kNFyLdNzvGg
1•mooreds•12m ago•0 comments

GitHub Wrapped

https://www.trygitwrap.com/
1•nailer•14m ago•0 comments

Show HN: Dinotool – a foundation model vector embedding CLI

https://github.com/mikkoim/dinotool
1•mikkoim•16m ago•0 comments

Show HN: When TOON isnt enough, you need to GOON

https://github.com/GOON-format/goon
1•productiongrad•16m ago•0 comments

Undefined behaviour spotted in safe Rust code

https://twitter.com/BrianOrwe/status/1996624161209569696
1•quelsolaar•17m ago•0 comments

Meta Weighs Cuts to Its Metaverse Unit

https://www.nytimes.com/2025/12/04/technology/meta-weighs-cuts-to-its-metaverse-unit.html
2•donohoe•17m ago•0 comments

The Kenyan workers training China's AI models

https://restofworld.org/2025/kenya-china-ai-workers/
2•poisonborz•18m ago•0 comments

Apple's attempt at system-wide filtering API – is it good? AdGuard's research

https://adguard.com/en/blog/apple-url-filter-system-wide-filtering-api.html
2•quyleanh•19m ago•0 comments

Understand and fix issues with Phoenix.Socket origin checks

https://revelry.co/insights/development/elixir/phoenix-socket-check-origin/
2•grossvogel•23m ago•1 comments

Bitbucket: "Announcing powerful upgrades and a new pricing model..."

https://www.atlassian.com/blog/bitbucket/announcing-v5-self-hosted-runners
2•wcv•26m ago•1 comments

PowerSync: SQLite based local-first sync engine

https://www.powersync.com
3•nnnnico•27m ago•0 comments

Bcachefs 1.33.0 – Reconcile

https://lore.kernel.org/linux-bcachefs/slvis5ybvo7ch3vxh5yb6turapyq7hai2tddwjriicfxqivnpn@xdpb25w...
2•todsacerdoti•29m ago•0 comments

I Loved 'SQL Noir', but I Wanted to Fix the Learning Curve. So I Built This

https://sqlcasefiles.com/
2•hackstarky•30m ago•1 comments

Netherlands, Spain, Ireland and Slovenia Boycott Eurovision After Israel Allowed

https://www.bbc.com/news/live/ce3xrywzpn6t
3•cramsession•30m ago•0 comments

Growth Marketing Manager

https://www.utahtechlabs.com/
1•prvaughan•30m ago•1 comments

Fermi estimate comparing human sensory bandwidth to LLM input bandwidth

https://sdeture.substack.com/p/comparing-human-sensory-bandwidth
2•algae_rhythm•33m ago•0 comments

J6 Pipe Bomb Suspect Arrested

https://www.cbsnews.com/news/pipe-bomb-suspect-arrest/
2•stack_framer•33m ago•2 comments

Advancing Microsoft 365 Government: New Capabilities and Pricing Update

https://techcommunity.microsoft.com/blog/publicsectorblog/advancing-microsoft-365-government-new-...
1•TechTechTech•33m ago•0 comments

Fungal compound for treating brain cancer synthesized

https://news.mit.edu/2025/mit-chemists-synthesize-fungal-compound-holds-promise-treating-brain-ca...
1•gmays•34m ago•1 comments