The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
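For anyone wondering what a "step-by-step reasoning path based on A* search" might look like in practice, here is a rough Python sketch. It is only an illustration under my own assumptions: the grid maze, the `("create" | "close", cell, g, h)` trace tuples, and the names `astar_with_trace` and `is_valid_plan` are hypothetical and are not taken from the paper. The point is that the final plan and the intermediate trace are separate outputs, so each can be checked on its own.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid with unit step costs.

    grid is a list of equal-length strings where '#' marks a wall.
    Returns (path, trace). path is a list of (row, col) cells or None.
    trace is a list of (op, cell, g, h) tuples, a stand-in for the
    intermediate tokens a model could be trained to emit (assumed format).
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def is_valid_plan(grid, path, start, goal):
    """Check the final answer alone, ignoring how it was produced."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for r, c in path:
        in_bounds = 0 <= r < len(grid) and 0 <= c < len(grid[0])
        if not in_bounds or grid[r][c] == "#":
            return False
    # Every consecutive pair of cells must be exactly one step apart.
    return all(
        abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
        for a, b in zip(path, path[1:])
    )
```

Calling `astar_with_trace(["..#", "..#", "..."], (0, 0), (2, 2))` returns a plan plus the trace that produced it. As I read the summary above, the experiment amounts to swapping that trace for an unrelated or corrupted one in the training data while keeping the plan, then seeing whether `is_valid_plan`-style accuracy on the final answers changes.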

404PageFound – Active Vintage Websites, Old Webpages, and Web 1.0

https://www.404pagefound.com
1•OuterVale•1m ago•0 comments

Ask HN: AI-first SaaS vs. AI-assisted. which one will survive?

1•kathir05•2m ago•0 comments

I built Rubric, an open source Sentry for AI. Looking for beta testers

https://github.com/tryrubric/rubric
1•tryrubric•3m ago•0 comments

How we made Ramp Sheets self-maintaining

https://twitter.com/RampLabs/status/2036165188899012655
1•ramonga•4m ago•0 comments

The Armor Gap

https://danieltan.weblog.lol/2026/03/the-armor-gap
1•danieltanfh95•12m ago•0 comments

Falling Sand Engine/Sandbox

https://sandspiel.club/
1•Imustaskforhelp•13m ago•0 comments

I built a tool that tells you NOT to build your startup idea – DontBuild.It

https://dontbuild.it/
1•dragonmann•15m ago•0 comments

Open-Source AI Text-to-Speech Models You Can Run Locally for Natural Voice

https://firethering.com/best-open-source-tts-models/
1•steveharing1•16m ago•0 comments

Silicon Valley in Your Pocket

https://siliconvalleyinyourpocket.com/
1•larsling•18m ago•0 comments

ICE: $45 an Hour to Stand There. TSA: $0 an Hour to Keep You Safe

https://botonomous.ai/featured/government_shutdown/article-v2.html
2•botonomous•19m ago•0 comments

Ask HN: Google account on old yahoo.com email hijacked to Google Workspace

2•FlyingAvatar•21m ago•0 comments

AI Agents Can Autonomously Perform Experimental High Energy Physics

https://arxiv.org/abs/2603.20179
1•KolenCh•22m ago•1 comments

Asia boosts coal use as Iran war squeezes global LNG supplies

https://www.npr.org/2026/03/24/g-s1-114940/asia-boosts-coal-use-as-iran-war-squeezes-global-lng-s...
1•geox•23m ago•0 comments

The Rise of the Ray-Ban Meta Creep

https://www.wired.com/story/the-rise-of-the-ray-ban-meta-creep/
2•thm•24m ago•0 comments

DOGE Goes Nuclear: How Trump Invited Silicon Valley into Nuclear Power Regulator

https://www.propublica.org/article/trump-nuclear-power-nrc-safety-doge-vought
2•ourmandave•24m ago•0 comments

Coding After Coders: The End of Computer Programming as We Know It

https://www.nytimes.com/2026/03/12/magazine/ai-coding-programming-jobs-claude-chatgpt.html
1•nguyentranvu•25m ago•1 comments

Debunking Zswap and Zram Myths

https://chrisdown.name/2026/03/24/zswap-vs-zram-when-to-use-what.html
3•javierhonduco•29m ago•0 comments

Building a symbolic math REPL in C

https://github.com/marcomit/derive.c
1•marcomit•31m ago•1 comments

Aletheia – deterministic COBOL verification for mainframe migrations

https://github.com/Aletheia-Verification/Aletheia
2•HectorBlai•31m ago•0 comments

Porting Doom to AIX on IBM RS/6000 [video]

https://www.youtube.com/watch?v=XzhCGSE7KKw
1•hxorr•33m ago•0 comments

LLM Neuroanatomy II: Modern LLM Hacking and Hints of a Universal Language?

https://dnhkng.github.io/posts/rys-ii/
2•realberkeaslan•34m ago•0 comments

Ask HN: Anyone trading prediction markets programmatically?

1•sharp_runner_84•36m ago•0 comments

Hack discovered at (NL) Ministry of Finance; Unclear if data was accessed

https://nltimes.nl/2026/03/24/hack-discovered-ministry-finance-unclear-data-accessed
1•mvdwoord•40m ago•0 comments

USA bans all new routers for consumers

https://www.heise.de/en/news/USA-bans-all-new-routers-for-consumers-11222049.html
6•esher•41m ago•0 comments

Synaps a self-hosted personal health monitor as a weighted knowledge grap

https://github.com/scerelli/SYNAPS
1•succo•43m ago•1 comments

Steve Jobs, speech at the Apple campus (1999) [video]

https://www.youtube.com/watch?v=EoM2Y2KO6kU
1•rbinv•43m ago•0 comments

Can It Resolve Doom? Game Engine in 2k DNS Records

https://core-jmp.org/2026/03/can-it-resolve-doom-game-engine-in-2000-dns-records/
2•Einenlum•45m ago•0 comments

Yann LeCun's LeWorldModel: Stable End-to-End JEPA from Pixels

https://le-wm.github.io/
2•matthieu_bl•46m ago•1 comments

America's Chief Financial Officers Say AI Is Coming for Admin Jobs

https://www.wsj.com/tech/ai/ai-admin-job-market-6a1c3436
2•cebert•48m ago•1 comments

The Wrong Abstraction

https://sandimetz.com/blog/2016/1/20/the-wrong-abstraction
2•mihau•50m ago•1 comments