The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
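For anyone wondering what a "reasoning path based on A* search" could look like in practice, here is a minimal sketch. It runs A* on a small grid and logs each search step as a trace next to the final path. The grid encoding, the function name `astar_with_trace`, and the `("create" | "close", cell, g, h)` trace format are all assumptions made for illustration; the summary above does not say how the paper serializes its traces.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of equal-length strings where '#' marks a wall.
    The trace format is hypothetical, chosen only for illustration.
    """
    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]  # entries are (f, g, cell)
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    trace = []

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            # Walk the parent pointers back to the start to build the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            nxt, ng = (nr, nc), g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))

    return None, trace  # goal unreachable
```

Calling `astar_with_trace(["...", ".#.", "..."], (0, 0), (2, 2))` returns the path and the list of trace steps. In a setup like the one described, the trace would play the role of the intermediate tokens and the path would be the final answer, so the two can be checked separately.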
