The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•11mo ago

Comments

tocs3•11mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
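To make the setup in the summary a bit more concrete, here is a minimal sketch of how an A* solver can emit both a step-by-step trace and a final answer, and how each can be checked on its own. This is not the paper's code: the small grid, the `create`/`close` trace format, and every function name are assumptions made purely for illustration.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid ('#' = wall). Returns (trace, path).

    trace is the ordered list of search events (a stand-in for the
    "intermediate tokens"); path is the final answer, or None.
    """
    def h(p):  # Manhattan distance to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]
    parent, cost = {start: None}, {start: 0}
    closed, trace = set(), []
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        trace.append(("close", node, g))
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return trace, path[::-1]
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if inside and grid[nr][nc] != "#" and g + 1 < cost.get(nxt, float("inf")):
                cost[nxt], parent[nxt] = g + 1, node
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt, g + 1))
    return trace, None

def path_is_valid(grid, path, start, goal):
    """Check the final answer only: right endpoints, unit steps, no walls."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True

def trace_is_valid(trace, start):
    """Check the trace only: a node must be created before it is closed."""
    created = {start}
    for op, node, _ in trace:
        if op == "create":
            created.add(node)
        elif node not in created:
            return False
    return True

# Hypothetical 3x3 maze used only for this example.
grid = ["..#",
        ".#.",
        "..."]
start, goal = (0, 0), (2, 2)
trace, path = astar_with_trace(grid, start, goal)
scrambled = trace[::-1]  # same events, order destroyed
```

The two validators are deliberately independent: `path_is_valid` never looks at the trace, so a correct `path` can be paired with `scrambled` and still pass the answer check while failing `trace_is_valid`. That separation is the kind of measurement the summary is describing when it says a right answer can come with wrong or messy reasoning steps.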

Show HN: Economic growth is a power law

https://julienreszka.github.io/economic-simulator/armey-curve.html
1•julienreszka•2m ago•0 comments

Why C Remains the Gold Standard for Cryptographic Software

https://www.wolfssl.com/why-c-remains-the-gold-standard-for-cryptographic-software/
2•LinuxJedi•4m ago•0 comments

40 Years Ago, a Nuclear Catastrophe at Chernobyl

https://www.nytimes.com/2026/04/26/world/europe/40-years-ago-a-nuclear-catastrophe-at-chernobyl.html
1•HelloUsername•5m ago•0 comments

Codex MSN Interface

https://codexmessenger.net/
1•blef•10m ago•0 comments

Headless websites and the cost of engineering vanity

https://www.jonoalderson.com/conjecture/headless-websites/
1•misone•11m ago•0 comments

Quick tutorial to get a blog online from Org Mode thanks to Org Social

https://en.andros.dev/blog/c68f00c3/quick-tutorial-to-get-a-blog-online-from-org-mode-thanks-to-o...
1•andros•12m ago•0 comments

APL is more French than English

https://www.jsoftware.com/papers/perlis78.htm
1•tosh•13m ago•0 comments

The Knight Programming Language

https://github.com/knight-lang/knight-lang/tree/master
1•tosh•15m ago•0 comments

Exposing Floating Point – Bartosz Ciechanowski

https://ciechanow.ski/exposing-floating-point/
2•subset•18m ago•0 comments

Seven database engines in a single Rust binary

https://github.com/nodeDB-Lab/nodedb
1•mansarip•22m ago•0 comments

Tip: Web requests should not be measured in Hz [Hertz]

https://mastodon.catgirl.cloud/@sophie/116467789133733136
1•robin_reala•24m ago•0 comments

Self-Updating Screenshots

https://interblah.net/self-updating-screenshots
1•bjhess•36m ago•0 comments

Open grid data has a public benefit

https://nworbmot.org/blog/open-grid-data.html
1•lyoncy•37m ago•0 comments

Airprompt – SSH into your Mac from your phone for AI agent prompts

https://www.npmjs.com/package/airprompt
2•hatefrad•39m ago•1 comments

Show HN: A community powered global network of probes

https://github.com/jsdelivr/globalping
1•jimaek•41m ago•0 comments

The Scrum-to-POM Transition Is a Role Repositioning Event

https://age-of-product.com/scrum-to-pom-transition/
1•swolpers•43m ago•0 comments

Pytest-cloudreport – local HTML reports and flaky-test detection for pytest

https://github.com/ahmad212o/pytest-cloudreport
1•ahmad212o•44m ago•0 comments

Blueprint: AI Hardware Design

https://www.blueprint.am/
1•handfuloflight•48m ago•0 comments

US is making Europe pay dearly for its half-hearted electrification

https://www.programmablemutter.com/cp/195461224
2•hackandthink•49m ago•0 comments

The reporters at this news site are AI bots. OpenAI's super PAC is funding it

https://twitter.com/TheMidasProj/status/2047692328396034490
1•pretext•54m ago•0 comments

San Francisco must preserve the birthplace of the Mission burrito

https://www.sfchronicle.com/food/restaurants/article/el-faro-mission-burrito-creator-22206173.php
3•divbzero•54m ago•0 comments

Enterprises Are Rethinking Kubernetes

https://www.infoworld.com/article/4161056/enterprises-are-rethinking-kubernetes.html
3•milkglass•57m ago•0 comments

Talk a stranger for fun or everything else

https://bakbak.fun/
3•chintan39•1h ago•1 comments

The West Forgot How to Make Things. Now It's Forgetting How to Code

https://techtrenches.dev/p/the-west-forgot-how-to-make-things
82•milkglass•1h ago•28 comments

The Coding Assistant Breakdown: More Tokens Please

https://newsletter.semianalysis.com/p/the-coding-assistant-breakdown-more
1•gmays•1h ago•0 comments

WTF Are Metaballs?

https://www.youtube.com/watch?v=LW03EEKjy9o
2•gdubs•1h ago•3 comments

Iran war hits Dubai chocolate pistachio supplies

https://www.ft.com/content/438ef32a-59e5-41b3-a0da-569716385347
1•KnuthIsGod•1h ago•0 comments

CO operating system age-verification open-source exemption doesn't include Linux

https://twitter.com/LundukeJournal/status/2048199650117554678
6•gasull•1h ago•0 comments

Why Rome Never Industrialized [video]

https://www.youtube.com/watch?v=uR8-AF6NJcc
2•Khaine•1h ago•1 comments

A Taiwanese Vestige in the Geedge Supply Chain

https://interseclab.org/research/madlink-a-taiwanese-vestige-in-the-geedge-supply-chain/
3•gslin•1h ago•0 comments