The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
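For anyone unfamiliar with the setup the summary describes, here is a minimal sketch of what "a reasoning path based on A* search" could look like: a solver that returns both the final answer (the path) and a log of its intermediate steps. The grid world, the `astar_with_trace` name, and the `create`/`close` trace tuples are assumptions made for illustration and are not taken from the paper.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free cell, 1 = wall).

    Returns (path, trace). The trace is a list of (op, node, g, h) tuples,
    a hypothetical format standing in for "intermediate tokens".
    """
    def h(node):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]  # entries are (f, g, node)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue  # stale queue entry, already expanded with a lower cost
        closed.add(node)
        trace.append(("close", node, g, h(node)))

        if node == goal:
            # Walk parent links back to the start to recover the answer.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], trace

        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            nxt, ng = (nr, nc), g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = node
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))

    return None, trace  # goal unreachable


grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(grid, (0, 0), (2, 0))
```

Because the trace comes from a known algorithm, each step can be checked mechanically, which is what lets the authors compare "correct answer" against "correct reasoning steps" separately.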

Why Regulatory Scrutiny of AI Becomes Inevitable

https://www.aivojournal.org/why-regulatory-scrutiny-of-ai-becomes-inevitable/
1•businessmate•2m ago•1 comments

The Case for Upending World Trade

https://www.foreignaffairs.com/united-states/case-upending-world-trade
1•TMWNN•5m ago•0 comments

The Long History of Technologically Assisted Writing

https://lithub.com/inside-the-long-history-of-technologically-assisted-writing/
1•benbreen•8m ago•0 comments

First Contact with America

https://novum.substack.com/p/first-contact-with-america
1•paulpauper•12m ago•0 comments

Claude's Constitutional Structure

https://thezvi.substack.com/p/claudes-constitutional-structure
1•paulpauper•12m ago•0 comments

Tim Cook Attends Screening of Propaganda for Authoritarian's Wife

https://pxlnv.com/linklog/cook-melania/
3•HotGarbage•13m ago•0 comments

Apple 'Honours' Martin Luther King While Working Against His Legacy

https://pxlnv.com/linklog/apple-mlk/
1•HotGarbage•14m ago•0 comments

Show HN: An Internationalization GitHub Action to Replace Crowdin with LLMs

https://github.com/i18n-actions/ai-i18n
1•cport1•18m ago•1 comments

Quaternion Algebras

https://jvoight.github.io/quat.html
1•teleforce•21m ago•0 comments

Show HN: Drum machine VST made with React/C++

https://okaysynthesizer.com
1•tabakd•23m ago•0 comments

Why IDE-Level AI Rules Will Always Lose to Model-Level Capabilities

https://lellansin.github.io/2026/01/27/Why-Cursor-Rules-Failed-and-Claude-Skill-Succeeded/
1•lellansin•23m ago•1 comments

Benchmarking the JDBC Bottleneck in Trino

https://www.starburst.io/blog/benchmarking-the-jdbc-bottleneck-in-trino/
1•abadid•24m ago•1 comments

Reactive Custom Web Scratchpad

https://qeditor.dev/
1•dpweb•29m ago•0 comments

Tell HN: Ask your AI, "What can you tell me that you know about me?" to see ...

2•marcuswestin•31m ago•0 comments

Show HN: A Local OS for LLMs. MIT License. Zero Hallucinations. (Not Crank)

https://github.com/merchantmoh-debug/Remember-Me-AI
2•MohskiBroskiAI•31m ago•0 comments

Japan's Debt Bomb Is About to Explode and Hit the US [video]

https://www.youtube.com/watch?v=CGoYHWT-0Tk
2•thelastgallon•35m ago•0 comments

Show HN: Soka – The Golden Rule Licensed Under the GNU General Public License

https://github.com/bfinan/soka
2•brendanfinan•38m ago•0 comments

Show HN: Bingsan – Apache Iceberg REST Catalog in Go (24k rps, multi-node)

https://teampaprika.github.io/bingsan/en/
1•youngbum•40m ago•0 comments

Ask HN: How do you maintain your health while coding?

2•anirudhviswa•42m ago•0 comments

January Ice

https://fnhipster.com/posts/january-ice
1•fnhipster•42m ago•0 comments

Last Call for Mass Market Paperbacks

https://www.publishersweekly.com/pw/by-topic/industry-news/publisher-news/article/99293-last-call...
5•wombatpm•55m ago•1 comments

Letter on Modern Counterculture

https://johnhiggs.substack.com/p/new-moon-letter-1
1•dom2•56m ago•0 comments

Check for AI Search Optimization on Your Webpage

https://old.reddit.com/r/ActorReviews/comments/1qo3p34/check_for_ai_search_optimization_on_your_w...
1•johncole•1h ago•0 comments

Blinkys – Digital Lifeforms Simulation

https://blinkys.entropicsystems.net/manual.html
1•snorbleck•1h ago•0 comments

Brain bran: The protective effect that fibre has on cognition

https://www.bbc.com/future/article/20260122-the-protective-effect-that-fibre-has-on-cognition
1•breve•1h ago•0 comments

Maia 200: The AI accelerator built for inference

https://blogs.microsoft.com/blog/2026/01/26/maia-200-the-ai-accelerator-built-for-inference/
2•boulos•1h ago•1 comments

Do you have to know every line of code your agent writes?

https://registerspill.thorstenball.com/p/joy-and-curiosity-71
1•alvivar•1h ago•0 comments

XCCache: A caching tool for Xcode projects, with SPM support

https://github.com/trinhngocthuyen/xccache
1•wahnfrieden•1h ago•0 comments

Bald eagle chick watch 2026: Jackie lays first egg

https://www.popsci.com/environment/bald-eagle-jackie-lays-first-egg-2026/
2•WaitWaitWha•1h ago•0 comments

Designing the Powerpuff Girls

https://animationobsessive.substack.com/p/designing-the-powerpuff-girls-512
1•ani_obsessive•1h ago•0 comments