The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps — and surprisingly, they still perform about the same, and sometimes even better, than those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
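To make the distinction in the summary concrete, here is a minimal sketch of checking a final answer and a reasoning trace separately. It assumes a small grid maze, a Manhattan-distance heuristic, and a made-up trace format of ("create" | "close", cell, g, h) tuples; the function names are hypothetical, and none of this is taken from the paper's actual setup or validators.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid ('#' = wall); return (plan, trace).

    The trace format is an assumption made for this sketch: a list of
    ("create" | "close", cell, g, h) tuples in the order they occur.
    """
    def h(cell):
        # Manhattan distance to the goal (admissible on a 4-connected grid).
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    trace = [("create", start, 0, h(start))]
    frontier = [(h(start), 0, start)]
    parent = {start: None}
    best_g = {start: 0}
    closed = set()
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))
        if cell == goal:
            plan = []
            while cell is not None:
                plan.append(cell)
                cell = parent[cell]
            return plan[::-1], trace
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get((nr, nc), float("inf")):
                best_g[(nr, nc)] = ng
                parent[(nr, nc)] = cell
                heapq.heappush(frontier, (ng + h((nr, nc)), ng, (nr, nc)))
                trace.append(("create", (nr, nc), ng, h((nr, nc))))
    return None, trace


def plan_is_valid(grid, start, goal, plan):
    """Check the final answer only: a wall-free path of unit steps."""
    if not plan or plan[0] != start or plan[-1] != goal:
        return False
    for r, c in plan:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(plan, plan[1:]))


def trace_is_valid(trace):
    """Check one structural property of the trace only: every closed
    node was created earlier with the same cost g."""
    created = set()
    for op, cell, g, _ in trace:
        if op == "create":
            created.add((cell, g))
        elif (cell, g) not in created:
            return False
    return True


GRID = ["..#",
        "...",
        "#.."]
START, GOAL = (0, 0), (2, 2)
plan, trace = astar_with_trace(GRID, START, GOAL)
print(plan_is_valid(GRID, START, GOAL, plan), trace_is_valid(trace))
# A correct plan paired with a scrambled trace: the two checks disagree.
print(plan_is_valid(GRID, START, GOAL, plan), trace_is_valid(trace[::-1]))
```

Because the two checks are independent, an output can pass `plan_is_valid` while failing `trace_is_valid`, which is the kind of mismatch the summary describes. The trace check here covers only one property and is far weaker than a full A* semantics check.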
