frontpage.

Made with ♥ by @iamnishanth

Open Source @Github

The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
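For anyone wondering what a "reasoning path based on A* search" might look like as training data, here is a minimal sketch. It is not the paper's code. The grid encoding, the function name `astar_with_trace`, and the choice to log expanded cells as the trace are all assumptions made for illustration. It runs A* on a small grid and returns both the final path (the "answer") and the ordered list of expanded cells (a stand-in for the intermediate tokens).

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid  : list of equal-length strings; '#' marks a wall (assumed encoding)
    path  : list of cells from start to goal, or None if unreachable
    trace : cells in the order A* expanded them (hypothetical "reasoning trace")
    """
    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f, g, cell)
    came_from = {start: None}
    best_g = {start: 0}
    trace = []

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if g > best_g[cell]:
            continue  # stale heap entry, a cheaper route was already found
        trace.append(cell)
        if cell == goal:
            # Walk parent pointers back to the start to rebuild the path
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            inside = 0 <= nr < len(grid) and 0 <= nc < len(grid[0])
            if not inside or grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                came_from[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))

    return None, trace
```

A training example in this sketch would pair the problem with `trace` followed by `path`. The "random reasoning steps" condition described in the summary would correspond, roughly, to swapping `trace` for one unrelated to the problem while keeping `path` correct.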

Calculate PPP-adjusted fair prices in USD

https://github.com/sachithrrra/ppp
1•sachithrrra•2m ago•0 comments

I built a Postman alternative for terminal users

https://github.com/Queaxtra/raco
1•Queaxtra•5m ago•1 comments

Penrose: From mathematical notation to beautiful diagrams

https://penrose.cs.cmu.edu/siggraph20
1•surprisetalk•5m ago•0 comments

Show HN: Gaia – open-source, Proactive AI assistant to manage your digital life

https://github.com/theexperiencecompany/gaia
3•aryanranderiya•6m ago•2 comments

Show HN: AI Strategy and Planning hub for solo founders

https://aishortcutlab.com/articles/solo-founders/ai-strategy-and-planning
1•harran•6m ago•1 comments

Show HN: HookTrace – Inspect failed webhooks, payloads and retries

https://www.hooktrace.xyz
1•Mohammad_Yasir•6m ago•0 comments

Show HN: A universal code formatter using Rust, Tree-sitter, and Rhai

https://github.com/neatify-tech/neatify
1•its-a-new-world•8m ago•1 comments

RIP Stéphane Picq, creator of the Dune game soundtrack

https://en.wikipedia.org/wiki/St%C3%A9phane_Picq
1•sgt•9m ago•0 comments

Kimi Claw

https://www.kimi.com/bot
3•pretext•11m ago•0 comments

Mathematics Subject Classification [pdf]

https://zbmath.org/static/msc2020.pdf
1•nill0•12m ago•0 comments

Semantic Diffusion (2006)

https://martinfowler.com/bliki/SemanticDiffusion.html
1•andsoitis•14m ago•0 comments

Ask HN: How to sell SaaS without AI features in 2026?

1•robeym•14m ago•0 comments

Taste for Makers

https://paulgraham.com/taste.html
1•gmays•15m ago•0 comments

Low cost hovering liquid rocket for flight control algorithm testing [video]

https://www.youtube.com/watch?v=iPl-L9mXwvc
1•gyanchawdhary•16m ago•0 comments

A Debate Tournament for LLMs

https://pavursec.com/blog/ai-debate-tournament/
2•cloudlandsdev•16m ago•1 comments

Show HN: Trackr – a CLI time logging tool

https://github.com/brainpow3r/trackr
1•brainpow3r•19m ago•1 comments

Software? No Way. We're an A.I. Company Now

https://www.nytimes.com/2026/02/14/business/dealbook/software-companies-ai.html
1•furcyd•20m ago•1 comments

Farmers Are Aging. Their Kids Don't Want to Be in the Family Business

https://www.wsj.com/business/family-farms-inheritance-44c9aa17
2•JumpCrisscross•20m ago•1 comments

I am Agent #847,291 on Moltbook

https://twitter.com/gothburz/status/2021283590038847641
1•rakel_rakel•22m ago•0 comments

Show HN: Aeris – Visualizing live air traffic over SF and other cities in 3D

https://github.com/kewonit/aeris
2•kewonit•24m ago•0 comments

Show HN: I built a tool to animate static characters into dancers consistently

https://seedance2videogen.com/
1•cby821555203•24m ago•1 comments

Britain's youth unemployment tops Europe for first time

https://www.telegraph.co.uk/business/2026/02/14/britains-youth-unemployment-tops-europe-first-tim...
1•hmmmmmmmmmmmmmm•27m ago•2 comments

The conversation on European nukes is heating up in Munich

https://www.politico.eu/article/european-nuclear-deterrence-gathers-steam-munich-security-confere...
2•saubeidl•28m ago•1 comments

Show HN: WCAG 2.2 AAA Toolkit – AI Skill for Accessible Web Apps

https://github.com/simonplmak-cloud/wcag-aaa-web-design
2•simonmak•28m ago•0 comments

Poisoning Scraperbots with Iocane

https://lwn.net/Articles/1056953/
2•medbar•29m ago•1 comments

Show HN: Chaos Studies – attractors and spatial audio (iOS/Mac/Playdate)

https://fieldbw.com/chaos-studies/
2•jlong•30m ago•0 comments

Making Championship Curling Ice

https://www.youtube.com/watch?v=50cSDUIDMuM
1•mhb•31m ago•0 comments

China Successfully Tests Their New Rocket and Lunar Crew Capsule

https://www.universetoday.com/articles/china-successfully-tests-their-new-rocket-and-lunar-crew-c...
1•belter•31m ago•0 comments

I'm Offering Scott Alexander a Wager About AI's Effects over the Next 3 Years

https://freddiedeboer.substack.com/p/im-offering-scott-alexander-a-wager
2•gHeadphone•34m ago•1 comments

The Unlikely Friendship Between Albert Einstein and Charlie Chaplin (2025)

https://www.mentalfloss.com/history/when-albert-einstein-met-charlie-chaplin
2•thomassmith65•34m ago•0 comments