The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
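To make the setup in the summary concrete, here is a minimal sketch of what "a reasoning trace from A* search" could look like. It assumes a small grid maze and a made-up trace format of ("create", cell) and ("close", cell) events; the function names and the trace format are illustrative assumptions, not the paper's actual encoding. The checker shows how a trace can be validated separately from the final path, which is what allows comparing training on matching traces against training on unrelated ones.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (trace, path). The trace is a list of ("create" | "close", cell)
    events, a hypothetical stand-in for intermediate tokens; the path is the
    final answer, or None if the goal is unreachable.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start)]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry
        closed.add(cell)
        trace.append(("close", cell))
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return trace, path[::-1]
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] != 0:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt))
    return trace, None


def trace_is_consistent(trace, path):
    """Loose validity check: each cell is closed at most once, only after
    being created, and every cell on the path was closed by the trace."""
    created, closed = set(), set()
    for op, cell in trace:
        if op == "create":
            created.add(cell)
        elif op == "close":
            if cell not in created or cell in closed:
                return False
            closed.add(cell)
    return path is not None and all(cell in closed for cell in path)


maze = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
trace, path = astar_with_trace(maze, (0, 0), (2, 0))
# A trace taken from a different problem, paired with this problem's answer.
other_trace, _ = astar_with_trace(maze, (2, 2), (2, 1))
```

Here `trace_is_consistent(trace, path)` holds for the matching pair, while `trace_is_consistent(other_trace, path)` does not. Per the summary, models trained on pairs of the second kind still did about as well on final answers.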
