The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•8mo ago

Comments

tocs3•8mo ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
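To make the setup in that summary concrete, here is a rough sketch of what "checking both the final answer and the reasoning steps" could look like for an A* solver on a small grid maze. This is not the paper's code: the grid encoding, the `("create" | "close", cell)` trace format, and every function name below are assumptions made up for illustration.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of strings where '#' marks a wall. The trace is a list of
    ("create" | "close", (row, col)) steps; this format is hypothetical.
    """
    def h(p):
        # Manhattan distance: admissible and consistent on a 4-connected grid.
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    trace = [("create", start)]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    cost = {start: 0}
    closed = set()
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(node)
        trace.append(("close", node))
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            if nxt not in cost or g + 1 < cost[nxt]:
                cost[nxt] = g + 1
                parent[nxt] = node
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt))
    return None, trace


def path_is_valid(grid, path, start, goal):
    """Check the final answer only: a wall-free, step-by-step path."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(path, path[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == "#":
            return False
    return True


def trace_is_valid(trace):
    """Check the intermediate steps only: create before close, close once."""
    created, closed = set(), set()
    for op, node in trace:
        if op == "create":
            created.add(node)
        elif op == "close":
            if node not in created or node in closed:
                return False
            closed.add(node)
        else:
            return False
    return True


grid = ["..#",
        "...",
        "#.."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
corrupted = list(reversed(trace))  # same steps, scrambled order
print(path_is_valid(grid, path, (0, 0), (2, 2)), trace_is_valid(trace))
print(path_is_valid(grid, path, (0, 0), (2, 2)), trace_is_valid(corrupted))
```

The point of having two separate checkers is that they can disagree: the same correct path can be paired with a trace that does not describe a legal search, which is the kind of mismatch the summary says the authors measured.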
