The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
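For readers who want a concrete picture of what "reasoning paths based on A* search" could look like, here is a minimal sketch. It is an illustration under assumptions, not the paper's code: the grid setup, the function name `astar_with_trace`, and the `("create" | "close", cell, cost, heuristic)` trace format are all hypothetical choices made for this example. The last lines show one simple way to pair each problem with a trace from a different problem, which is the general idea behind training on unrelated steps.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace format is a hypothetical one chosen for
    this sketch: each entry is (operation, cell, cost_so_far, heuristic).
    """
    def h(cell):
        # Manhattan distance to the goal (admissible for unit-cost moves).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]   # entries are (f, g, cell)
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue                     # stale queue entry, already expanded
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue                 # off the grid
            if grid[nr][nc] == 1:
                continue                 # wall
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
                trace.append(("create", nxt, new_cost, h(nxt)))
    return None, trace                   # goal unreachable


# Two toy problems, each given as (grid, start, goal).
problems = [
    ([[0, 0, 0], [1, 1, 0], [0, 0, 0]], (0, 0), (2, 0)),
    ([[0, 0], [0, 0]], (0, 0), (1, 1)),
]
solved = [astar_with_trace(*p) for p in problems]

# "Correct" training pairs: each problem with its own trace and answer.
correct_pairs = [(p, trace, path) for p, (path, trace) in zip(problems, solved)]

# "Swapped" pairs: the right answer, but the trace of a different problem.
rotated = solved[1:] + solved[:1]
swapped_pairs = [(p, other_trace, path)
                 for p, (path, _), (_, other_trace) in zip(problems, solved, rotated)]
```

In this sketch the final answer in each swapped pair is still correct, while the intermediate steps no longer describe the problem they are attached to.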

The Miller Principle

https://puredanger.github.io/tech.puredanger.com/2007/07/11/miller-principle/
2•FelipeCortez•1m ago•0 comments

How to Support Notebooks in a Language Server

https://pyrefly.org/blog/notebook/
2•ocamoss•2m ago•0 comments

Show HN: I used NLP to turn UK planning PDFs into a clean CSV

https://www.kaggle.com/datasets/strictschema/uk-planning-decisions-schema-sample
1•david_s_data•2m ago•1 comments

Model-Based Testing for Dungeons & Dragons

https://www.loskutoff.com/blog/model-based-testing-dnd/
1•Firfi•2m ago•2 comments

Recurring patterns aren't problems to solve. They're signals to read

https://medium.com/@genady_awarelife/why-today-looks-exactly-like-yesterday-6424362a80d4
1•genadym•2m ago•0 comments

Millions Across Europe Urged to Work from Home

https://www.newsweek.com/millions-across-europe-urged-work-from-home-energy-crisis-iran-war-11766442
1•robtherobber•3m ago•0 comments

Show HN: Fleet – I built a multi-agent dev team that runs as a bash script

https://danrex.github.io/blog/replaced-dev-team-with-bash-script/
1•christiangraf•3m ago•0 comments

ArtistKit – Free WordPress plugin for musicians Press kits

https://promotracker.fr/artistkit
1•davidabakan•3m ago•0 comments

Amesh – Replace static API keys with device-bound ECDSA identities

https://github.com/ameshdev/amesh
3•yau2026•6m ago•0 comments

Show HN: Nexulta – AI-powered crypto price predictions and market analysis

https://nexulta.com
2•enesz•6m ago•0 comments

Octopoda open source agent OS with memory, loop detection, and audit trails

https://github.com/RyjoxTechnologies/Octopoda-OS
2•Josephjackjrob1•8m ago•0 comments

Newsletify – Newsletter Sponsorship

https://newsletify.com/
2•ivanclaudiu•8m ago•0 comments

Free AI Video Generators in 2026: What Works

https://frameloop.ai/blog/free-ai-video-generator-tools-2026
2•avinashvagh•9m ago•1 comments

The best tools for sending an email if you go silent

https://blog.alcazarsec.com/posts/best-email-dead-mans-switches
3•alcazar•9m ago•0 comments

Show HN: Rac-delta – open protocol for differential dir sync (Rust/Node SDKs)

https://raccreative.github.io/rac-delta-docs/
2•Raccreative•10m ago•0 comments

Trump's next budget once again calls for cuts to science

https://arstechnica.com/science/2026/04/trumps-next-budget-once-again-calls-for-massive-cuts-to-s...
1•tartoran•10m ago•0 comments

An experimental Linux distribution that Redefines the filesystem hierarchy

https://www.gobolinux.org/
3•rainingmonkey•10m ago•0 comments

Attention Is All You Need, but All You Can't Afford – Hybrid Attention

2•JohannaAlmeida•11m ago•0 comments

Show HN: AgentLint – ESLint for your coding agents

https://github.com/samilozturk/agentlint
3•onurkanbkrc•13m ago•1 comments

Live Life on the Edge: A Layered Strategy for Testing Data Models

https://www.chiply.dev/post-data-model-testing
3•chiply•13m ago•1 comments

Show HN: 100 days running an AI agent – what behavioral drift looks like

https://cathedral-ai.com/cathedral-beta
1•mike-ward•13m ago•0 comments

Show HN: Real-time surveys via QR code, built with Cloudflare Durable Objects

https://rifts.to/blog/building-real-time-surveys-cloudflare-durable-objects
1•heffrey•14m ago•2 comments

Why 'Cost Disease' Is the Secret Force Behind America's Toxic Solitude

https://www.derekthompson.org/p/why-cost-disease-is-the-secret-force
2•alihm•14m ago•0 comments

CLI that gives AI agents structured SEO data for any URL – SGNL

https://github.com/stoyan-koychev/sgnl-cli
1•koychev•16m ago•0 comments

Show HN: A social feed with no algo where communities decide what gets seen

https://veridonia.com
2•smnkgv•16m ago•0 comments

Show HN: An open source CI/CD action to audit and fix AI generated UI code

3•heisen_berg•17m ago•0 comments

Copilot CLI can now ask a second model to critique the first

https://github.blog/ai-and-ml/github-copilot/github-copilot-cli-combines-model-families-for-a-sec...
2•summarity•18m ago•0 comments

Software Bonkers

https://craigmod.com/essays/software_bonkers/
1•theshrike79•18m ago•0 comments

A Fire Sale Has U.S. Office Buildings Going for 90% Off

https://www.wsj.com/real-estate/commercial/a-fire-sale-has-u-s-office-buildings-going-for-90-off-...
2•blinding-streak•18m ago•0 comments

AI agents that teach, debate and quiz you in real time

https://github.com/THU-MAIC/OpenMAIC
1•steveharing1•18m ago•0 comments