The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised by the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
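
For anyone wondering what "reasoning paths based on A* search" could look like in practice, here is a rough sketch. It is not the paper's code or trace format: the grid encoding, the `create`/`close` event names, and both function names are my own assumptions. It runs A* on a small grid, logs each search step as a trace, and checks the final path separately from the trace, which is the kind of split the summary describes between checking the answer and checking the steps.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid. Returns (path, trace).

    grid is a list of strings where '#' marks a wall.
    trace is a list of (event, node, g, h) tuples; the event names
    "create" and "close" are a hypothetical format, not the paper's.
    """
    def h(node):
        # Manhattan distance: admissible and consistent on a unit-cost grid.
        return abs(node[0] - goal[0]) + abs(node[1] - goal[1])

    rows, cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]
    best_g = {start: 0}
    parent = {start: None}
    closed = set()
    trace = [("create", start, 0, h(start))]

    while open_heap:
        _, g, node = heapq.heappop(open_heap)
        if node in closed:
            continue  # stale heap entry for an already expanded node
        closed.add(node)
        trace.append(("close", node, g, h(node)))
        if node == goal:
            # Walk parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], trace
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] != "#":
                ng = g + 1
                if ng < best_g.get(nxt, float("inf")):
                    best_g[nxt] = ng
                    parent[nxt] = node
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
                    trace.append(("create", nxt, ng, h(nxt)))
    return None, trace


def path_is_valid(grid, path, start, goal):
    """Check the final answer alone, ignoring whatever trace came with it."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    for (r, c) in path:
        if not (0 <= r < len(grid) and 0 <= c < len(grid[0])):
            return False
        if grid[r][c] == "#":
            return False
    # Every consecutive pair must be one step apart.
    return all(abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1
               for a, b in zip(path, path[1:]))


grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
print(path_is_valid(grid, path, (0, 0), (2, 2)), len(trace))
```

Since `path_is_valid` never looks at `trace`, a correct path can sit next to a scrambled or unrelated trace and still pass the answer check. As I read the summary, that is the situation the paper sets up on purpose.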
