The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•9mo ago

Comments

tocs3•9mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
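
For anyone wondering what a "reasoning path based on A* search" might look like in practice, here is a minimal sketch, not the paper's actual code. It runs A* on a small grid and records each step as a trace entry. The grid encoding, the function name `astar_with_trace`, and the `("create" | "close", cell, cost)` trace format are all assumptions made for illustration; the summary above does not specify any of them.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall (hypothetical encoding).
    The trace is a list of ("create" | "close", cell, cost) tuples; this
    format is an illustrative assumption, not the paper's.
    """
    def h(cell):
        # Manhattan distance: admissible on a unit-cost 4-connected grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = []
    g_cost = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g))

        if cell == goal:
            # Walk the parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            new_g = g + 1
            if new_g < g_cost.get((nr, nc), float("inf")):
                g_cost[(nr, nc)] = new_g
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (new_g + h((nr, nc)), new_g, (nr, nc)))
                trace.append(("create", (nr, nc), new_g))

    return None, trace  # goal unreachable


grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
```

Here `path` would play the role of the final answer and `trace` the intermediate "thoughts". As I read the summary, the experiment amounts to checking whether a trained model's emitted trace is a valid run of the solver, separately from whether its final path is correct.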
