The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
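For anyone wondering what a "reasoning path based on A* search" might look like in practice, here is a minimal sketch. It is not the paper's code: the grid task, the function name `astar_with_trace`, and the `("create" | "close", cell, cost, heuristic)` trace format are all assumptions made for illustration. The idea is only that the solver logs each step it takes, so the logged steps can be checked separately from the final answer.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (0 = free, 1 = wall).

    Returns (path, trace). The trace format is hypothetical:
    ("create" | "close", cell, cost_so_far, heuristic).
    """
    def h(cell):
        # Manhattan distance to the goal (admissible for unit step costs).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    frontier = [(h(start), 0, start)]   # entries are (f, g, cell)
    parent = {start: None}
    best_cost = {start: 0}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale queue entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, cost, h(cell)))
        if cell == goal:
            # Walk parent links back to the start to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols) or grid[nr][nc] == 1:
                continue
            new_cost = cost + 1
            if new_cost < best_cost.get(nxt, float("inf")):
                best_cost[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost + h(nxt), new_cost, nxt))
                trace.append(("create", nxt, new_cost, h(nxt)))
    return None, trace

# Hypothetical 3x3 maze with a wall across most of the middle row.
grid = [
    [0, 0, 0],
    [1, 1, 0],
    [0, 0, 0],
]
path, trace = astar_with_trace(grid, (0, 0), (2, 0))
print(path)
print(len(trace), "trace steps")
```

In this picture, a "correct" training example pairs a problem with its own trace and path, while the "random and incorrect" condition in the summary would pair the problem with a trace that does not belong to it.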
