The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
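To make the setup in the summary concrete, here is a minimal sketch of what "a step-by-step trace plus a final answer derived from A* search" could look like. Everything specific here is an assumption for illustration: the grid-maze task, the "create"/"close" event names, and the helper names `astar_with_trace` and `plan_is_valid` are hypothetical and are not taken from the paper. The point it tries to show is that the final plan can be checked on its own, independently of whether the trace that preceded it is a faithful record of the search.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (trace, plan). The trace is a list of hypothetical
    ("create" | "close", cell, g, h) events in the order they occur;
    the plan is the list of cells from start to goal, or None.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return trace, plan[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    parent[nxt] = cell
                    trace.append(("create", nxt, ng, h(nxt)))
                    heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
    return trace, None


def plan_is_valid(grid, start, goal, plan):
    """Check the final answer alone, ignoring any trace."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    rows, cols = len(grid), len(grid[0])
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        in_bounds = 0 <= r2 < rows and 0 <= c2 < cols
        adjacent = abs(r1 - r2) + abs(c1 - c2) == 1
        if not (in_bounds and adjacent and grid[r2][c2] == 0):
            return False
    return True
```

Because `plan_is_valid` never looks at the trace, a model could in principle emit a scrambled or unrelated trace and still produce a plan that passes this check, which is the kind of mismatch the summary describes.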
