
The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•10mo ago

Comments

tocs3•10mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
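
For anyone wondering what a "reasoning path based on A* search" could look like in practice, here is a minimal sketch. It is not the paper's code or its trace format: the grid world, the function name `astar_with_trace`, and the `("create" | "close", cell, cost)` step tuples are all assumptions made for illustration. The idea is only that the solver records each search step as it goes, so the intermediate steps and the final path can be checked separately.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid (1 = wall) and log every search step.

    Returns (path, trace). path is a list of cells, or None if the goal is
    unreachable. trace is a list of (op, cell, cost) tuples. This format is
    hypothetical and chosen only for illustration.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_cost = {start: 0}
    parent = {start: None}
    closed = set()
    trace = []

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, cost))
        if cell == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
                trace.append(("create", nxt, new_cost))
    return None, trace


# Small example grid (an assumption, not taken from the paper).
GRID = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(GRID, (0, 0), (2, 0))
```

With a logger like this, a training example could pair the trace with the final path, and the "random reasoning steps" condition described above would roughly correspond to pairing a problem with a trace that was not produced for it.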
