The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
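To make the setup in that summary concrete, here is a rough sketch of what "training on A* traces" versus "training on unrelated traces" could look like as data. Everything here is an assumption for illustration: the grid mazes, the `create`/`close` token format, and the helper names `astar_with_trace` and `make_example` are hypothetical and not taken from the paper. Pairing a problem with the trace of a different problem is only one possible way to get a "wrong" trace.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid; return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list
    of made-up "create"/"close" tokens, one per search operation.
    """
    def h(cell):
        # Manhattan distance, admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = []
    g_cost = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    closed = set()

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry, already expanded
        closed.add(cell)
        trace.append(f"close {cell[0]} {cell[1]} c{g} c{h(cell)}")
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(f"create {nr} {nc} c{ng} c{h(nxt)}")
    return None, trace


def make_example(grid, trace, path):
    """Bundle one training example: problem, intermediate tokens, answer."""
    return {"prompt": grid, "trace": trace, "answer": path}


# Two unrelated toy mazes (hypothetical data).
grid_a = ["..#", "...", "#.."]
grid_b = ["...", ".#.", "..."]
path_a, trace_a = astar_with_trace(grid_a, (0, 0), (2, 2))
path_b, trace_b = astar_with_trace(grid_b, (0, 0), (2, 2))

# "Correct" example: the trace really is the search that found the answer.
correct = make_example(grid_a, trace_a, path_a)
# "Swapped" example: right answer, but the trace belongs to another maze.
swapped = make_example(grid_a, trace_b, path_a)
```

In this sketch both examples carry the same correct answer, and only the intermediate tokens differ. That is the kind of comparison the summary describes.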
