The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
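For anyone wondering what "reasoning paths based on A* search" could look like in practice, here is a minimal sketch. It is not the paper's code: the grid-maze setup, the `("create" | "close", cell, g, h)` trace format, and the function names `astar_with_trace`, `plan_is_valid`, and `trace_is_consistent` are all assumptions made for illustration. The point is only that a formal solver lets you check the final plan and the intermediate steps separately.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list
    of (op, cell, g, h) tuples, a hypothetical format for this sketch.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    best_g = {start: 0}
    parent = {start: None}
    frontier = [(h(start), 0, start)]
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def plan_is_valid(grid, start, goal, path):
    """Check the final answer only: endpoints, adjacency, no walls."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r, c) in path:
        if grid[r][c] == "#":
            return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1:
            return False
    return True


def trace_is_consistent(trace):
    """Check the intermediate steps only: a cell must be created before it is closed."""
    created = set()
    for op, cell, _, _ in trace:
        if op == "create":
            created.add(cell)
        elif op == "close" and cell not in created:
            return False
    return True


grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(plan_is_valid(grid, (0, 0), (2, 2), path), trace_is_consistent(trace))
```

Because the two checks are independent, a model's output can pass `plan_is_valid` while failing `trace_is_consistent`, which is the kind of mismatch the summary above describes. The "random reasoning steps" condition would correspond, under the same assumptions, to pairing a correct plan with a trace that was not produced for that problem.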
