The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
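To make the setup in the summary more concrete, here is a minimal Python sketch of how an A* solver could log its intermediate steps as a trace, and how such a trace could be checked independently of the final answer. The grid encoding, the `create`/`close` token names, and the `trace_is_valid` check are assumptions for illustration only; they are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of equal-length strings where '#' marks a wall.
    The trace is a list of hypothetical ("create" | "close", cell, g, h)
    tuples recorded while the search runs.
    """
    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_cost = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]
    trace = [("create", start, 0, h(start))]
    closed = set()

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
                trace.append(("create", nxt, new_cost, h(nxt)))
    return None, trace  # goal unreachable


def trace_is_valid(trace):
    """Simplified check: every closed cell must have been created earlier."""
    created = set()
    for op, cell, _, _ in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True


grid = [
    "...",
    ".#.",
    "...",
]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path)
print(trace_is_valid(trace))
```

A check like `trace_is_valid` is what makes it possible to ask whether a correct final path came with a well-formed trace, which is the comparison the summary describes.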
