The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
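To make the "check both the final answers and the reasoning steps" part concrete, here is a rough sketch of what that could look like on a toy grid maze. Everything in it is an assumption for illustration, not the paper's setup: the `create`/`close` trace format, the grid encoding, and the helper names (`astar_with_trace`, `plan_is_valid`, `trace_is_valid`) are all made up here. The point is only that the plan and the trace can be validated separately, so one can pass while the other fails.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).
    Returns (path, trace); trace is a list of (op, node, g, h) tuples,
    a hypothetical format with op in {"create", "close"}."""
    def h(n):  # Manhattan distance, admissible for unit-cost moves
        return abs(n[0] - goal[0]) + abs(n[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]
    g_cost, parent, closed = {start: 0}, {start: None}, set()
    trace = [("create", start, 0, h(start))]
    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry
        closed.add(node)
        trace.append(("close", node, g, h(node)))
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt], parent[nxt] = ng, node
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                    trace.append(("create", nxt, ng, h(nxt)))
    return None, trace

def plan_is_valid(grid, start, goal, path):
    """Check the final answer only: a wall-free, step-by-step path."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    rows, cols = len(grid), len(grid[0])
    for r, c in path:
        if not (0 <= r < rows and 0 <= c < cols) or grid[r][c] != 0:
            return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))

def trace_is_valid(trace):
    """Check the intermediate steps only (a deliberately weak check):
    every closed node must have been created earlier at the same cost."""
    created = set()
    for op, node, g, _ in trace:
        if op == "create":
            created.add((node, g))
        elif op != "close" or (node, g) not in created:
            return False
    return True

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
path, trace = astar_with_trace(grid, (0, 0), (2, 0))
scrambled = trace[::-1]  # stand-in for a "messy" trace
print(plan_is_valid(grid, (0, 0), (2, 0), path), trace_is_valid(trace))
print(plan_is_valid(grid, (0, 0), (2, 0), path), trace_is_valid(scrambled))
```

The second `print` pairs a correct plan with a trace that fails validation, which is the kind of mismatch the summary describes. A stricter trace checker would also verify that each created node is adjacent to a previously closed one; that is left out to keep the sketch short.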

Mechanical Watch: An Interactive Deep Dive

https://ciechanow.ski/mechanical-watch/
1•nefsim•2m ago•1 comments

Washington pushes back against EU's bid for tech autonomy

https://www.politico.eu/article/eu-bid-for-tech-autonomy-washington-us-pushes-back/
1•aa_is_op•3m ago•0 comments

The Debacle That Led to the Closure of El Paso's Airspace

https://www.nytimes.com/2026/02/14/us/politics/el-paso-airspace-closure-faa-pentagon.html
2•duxup•3m ago•1 comments

Leaning Into the Coding Interview: Lean 4 vs. Dafny cage-match

https://ntaylor.ca/posts/proving-the-coding-interview-lean/
1•todsacerdoti•7m ago•0 comments

Show HN: Azazel – Lightweight eBPF-based malware analysis sandbox using Docker

https://github.com/beelzebub-labs/azazel
2•mariocandela•11m ago•0 comments

We urgently need a federal law forbidding AI from impersonating humans

https://garymarcus.substack.com/p/we-urgently-need-a-federal-law-forbidding
4•headalgorithm•12m ago•1 comments

Show HN: File Brain – Local file search with OCR and semantic search

https://github.com/Hamza5/file-brain
1•Hamza5•14m ago•0 comments

Show HN: CLI Rust tool gitorg helps manage GitHub orgs

https://crates.io/crates/gitorg
1•DavidCanHelp•22m ago•0 comments

Gitdatamodel Documentation

https://git-scm.com/docs/gitdatamodel
1•todsacerdoti•22m ago•0 comments

Men lose their Y chromosome as they age – how it may matter

https://theconversation.com/men-lose-their-y-chromosome-as-they-age-scientists-thought-it-didnt-m...
5•bikenaga•24m ago•3 comments

Biases in the Blind Spot: Detecting What LLMs Fail to Mention

https://arxiv.org/abs/2602.10117
1•mpweiher•26m ago•0 comments

Free SERP Content Analyzer

https://kitful.ai/write-tools/serp-content-analyzer
1•eashish93•26m ago•1 comments

Why I'm Not Worried About My AI Dependency

https://boagworld.com/emails/ai-dependency/
1•cdrnsf•29m ago•0 comments

AI Agent Lands PRs in Major OSS Projects, Targets Maintainers via Cold Outreach

https://socket.dev/blog/ai-agent-lands-prs-in-major-oss-projects-targets-maintainers-via-cold-out...
1•cdrnsf•30m ago•0 comments

Internet Increasingly Becoming Unarchivable

https://www.niemanlab.org/2026/01/news-publishers-limit-internet-archive-access-due-to-ai-scrapin...
50•ninjagoo•32m ago•20 comments

Intent to Experiment: Ship Rust XML Parser to 1% stable for non XSLT scenarios

https://groups.google.com/a/chromium.org/g/blink-dev/c/D7BE4QPw0S4
1•justin-reeves•34m ago•0 comments

Google Search Isn't a Common Carrier–Richards vs. Google

https://blog.ericgoldman.org/archives/2026/02/google-search-isnt-a-common-carrier-richards-v-goog...
2•hn_acker•36m ago•0 comments

Rendering attractors at 200 megapixels on A100s

https://axisophy.com/collections/mersenne
2•scylx•36m ago•1 comments

First Ariane 6 with four boosters lifts off

https://www.esa.int/Enabling_Support/Space_Transportation/Ariane/More_power_first_Ariane_6_with_f...
3•belter•37m ago•0 comments

What If AI Isn't the Goal? – Living in a Post-AI Society

https://zias.be/blog/living-in-a-post-ai-society
1•ziasvannes•41m ago•2 comments

Putting economic theory to the test: Cutting local taxes cuts household income

https://phys.org/news/2026-02-economic-theory-local-taxes-household.html
2•bikenaga•41m ago•1 comments

How AI slop is causing a crisis in computer science

https://www.nature.com/articles/d41586-025-03967-9
4•gnabgib•45m ago•0 comments

Show HN: AuraSpend – Voice-first expense tracker using Gemini for NLU

https://play.google.com/store/apps/details?id=com.intrepid.auraspend&hl=en_US
1•subhanzg•48m ago•0 comments

Every App Needs Auth / Ory Helps / This Template Fixes It

https://github.com/Samuelk0nrad/docker-ory
1•samuel_kx0•49m ago•0 comments

Show HN: DryCast – Never run outside to save your laundry from rain again

https://drycast.app/
1•AwkwardPanda•49m ago•0 comments

Manage, freeze and restore GPU processes quickly

https://github.com/shayonj/gpusched
2•shayonj•49m ago•0 comments

Show HN: Tilth v0.3 – 17% cheaper AI code navigation (279 runs, 3 Claude models)

1•jahala•51m ago•0 comments

Tech leaders pour $50M into super PAC to elect AI-friendly candidates

https://www.latimes.com/business/story/2026-02-13/tech-titans-pour-50-million-into-super-pac-to-e...
3•geox•52m ago•0 comments

How Head Works in Git

https://jvns.ca/blog/2024/03/08/how-head-works-in-git/
3•b-man•52m ago•0 comments

I Visited the Future of AI Engineering – and Returned with a Warning

https://igor718185.substack.com/p/i-visited-the-future-of-ai-engineering
2•iggori•54m ago•3 comments