
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
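To make the distinction in the summary concrete, here is a minimal Python sketch of checking a final answer and its intermediate steps separately. It is not the paper's code: the grid maze, the "create"/"close" trace format, and the names `astar_with_trace`, `trace_is_valid`, and `plan_is_valid` are all assumptions made for illustration.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected unit-cost grid. Returns (plan, trace).

    The trace is a list of (op, node, g) events, a hypothetical stand-in
    for the intermediate tokens a model would be trained on.
    """
    def h(p):  # Manhattan distance: admissible and consistent here
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    trace = [("create", start, 0)]
    frontier = [(h(start), 0, start)]
    best_g, parent, closed = {start: 0}, {start: None}, set()
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue  # stale heap entry
        closed.add(node)
        trace.append(("close", node, g))
        if node == goal:
            plan = []
            while node is not None:
                plan.append(node)
                node = parent[node]
            return plan[::-1], trace
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:  # wall
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = node
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng))
    return None, trace

def plan_is_valid(grid, start, goal, plan):
    """Check only the final answer: a wall-free path from start to goal."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
    return all(0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] == 0
               for r, c in plan)

def trace_is_valid(grid, start, trace):
    """Check only the intermediate steps, ignoring the final answer."""
    created, closed, current = {}, set(), None
    for op, (r, c), g in trace:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])) or grid[r][c] == 1:
            return False
        if op == "create":
            if current is None:
                if (r, c) != start or g != 0:
                    return False
            else:
                (cr, cc), cg = current
                # A created node must neighbor the last closed node.
                if abs(cr - r) + abs(cc - c) != 1 or g != cg + 1:
                    return False
            created[(r, c)] = min(g, created.get((r, c), g))
        elif op == "close":
            if (r, c) not in created or (r, c) in closed or created[(r, c)] != g:
                return False
            closed.add((r, c))
            current = ((r, c), g)
        else:
            return False
    return True

# Toy maze: 0 is free, 1 is a wall.
grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
start, goal = (0, 0), (2, 0)
plan, trace = astar_with_trace(grid, start, goal)
scrambled = trace[::-1]  # same events, no longer a coherent search
```

Here `plan_is_valid(grid, start, goal, plan)` still holds when the plan is paired with `scrambled`, while `trace_is_valid(grid, start, scrambled)` does not. That is the kind of gap the summary describes: a correct answer sitting next to intermediate steps that would not pass a check.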
