The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
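
To make the setup in the summary concrete, here is a minimal sketch of what "checking both the final answers and the reasoning steps" could look like. It assumes a small grid maze and a simplified trace made of `create`/`close` operations; the function names, the trace format, and the example grid are all illustrative assumptions, not the paper's actual code or data.

```python
import heapq

def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid (1 = wall). Returns (plan, trace)."""
    def h(p):  # Manhattan distance to the goal
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    frontier = [(h(start), 0, start)]
    parent, cost, closed = {start: None}, {start: 0}, set()
    trace = [("create", start)]  # hypothetical trace format
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if node in closed:
            continue
        closed.add(node)
        trace.append(("close", node))
        if node == goal:
            plan = []
            while node is not None:
                plan.append(node)
                node = parent[node]
            return plan[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == 1:
                continue
            if nxt not in cost or g + 1 < cost[nxt]:
                cost[nxt] = g + 1
                parent[nxt] = node
                heapq.heappush(frontier, (g + 1 + h(nxt), g + 1, nxt))
                trace.append(("create", nxt))
    return None, trace

def plan_is_valid(grid, plan, start, goal):
    """The final answer: a wall-free path of unit steps from start to goal."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for (r1, c1), (r2, c2) in zip(plan, plan[1:]):
        if abs(r1 - r2) + abs(c1 - c2) != 1 or grid[r2][c2] == 1:
            return False
    return True

def trace_is_valid(trace):
    """The reasoning steps: a node may only be closed once, after creation."""
    created, closed = set(), set()
    for op, node in trace:
        if op == "create":
            created.add(node)
        elif op == "close" and node in created and node not in closed:
            closed.add(node)
        else:
            return False
    return True

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
start, goal = (0, 0), (2, 0)
plan, trace = astar_with_trace(grid, start, goal)
scrambled = trace[::-1]  # stand-in for a corrupted trace

print(plan_is_valid(grid, plan, start, goal))  # answer check
print(trace_is_valid(trace))                   # trace check, clean trace
print(trace_is_valid(scrambled))               # trace check, corrupted trace
```

The point of the two separate validators is that they are independent: the same correct plan can be paired with a clean trace or a scrambled one, which is the kind of mismatch the summary says the authors measure and then deliberately train on.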
