The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about the same as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
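
To make the setup in that summary concrete, here is a minimal Python sketch of how a solver such as A* can yield both a final plan and a record of intermediate steps, and how the two can be checked separately. Everything here is an assumption for illustration: the grid encoding, the function names (`astar_with_trace`, `plan_is_valid`, `trace_is_consistent`), and the trace format are hypothetical and are not taken from the paper, and the trace check is far looser than a real validator would be.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """A* on a 4-connected grid. '#' marks a wall.

    Returns (path, trace). The trace lists cells in the order they were
    expanded; it is a hypothetical stand-in for "intermediate tokens".
    """
    def h(cell):
        # Manhattan distance: admissible and consistent on this grid.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    frontier = [(h(start), 0, start)]
    parent = {start: None}
    cost = {start: 0}
    trace = []
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if g > cost[cell]:
            continue  # stale queue entry, a cheaper route was found
        trace.append(cell)
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if not is_open(grid, nxt):
                continue
            new_g = g + 1
            if new_g < cost.get(nxt, float("inf")):
                cost[nxt] = new_g
                parent[nxt] = cell
                heapq.heappush(frontier, (new_g + h(nxt), new_g, nxt))
    return None, trace


def is_open(grid, cell):
    """True if the cell is inside the grid and not a wall."""
    r, c = cell
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#"


def adjacent(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def plan_is_valid(grid, start, goal, path):
    """Check the final answer only: a walkable path from start to goal."""
    if not path or path[0] != start or path[-1] != goal:
        return False
    if not all(is_open(grid, cell) for cell in path):
        return False
    return all(adjacent(a, b) for a, b in zip(path, path[1:]))


def trace_is_consistent(grid, start, trace):
    """Loose check on the steps only: every expanded cell is open and
    touches a cell expanded earlier (assumed criterion, not the paper's)."""
    if not trace or trace[0] != start:
        return False
    seen = [start]
    for cell in trace[1:]:
        if not is_open(grid, cell) or not any(adjacent(cell, s) for s in seen):
            return False
        seen.append(cell)
    return True


grid = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(grid, (0, 0), (2, 2))
corrupted_trace = [(0, 0), (2, 2)]  # jumps straight to the goal
```

In this toy example, `plan_is_valid(grid, (0, 0), (2, 2), path)` holds regardless of which trace accompanies the plan, while `trace_is_consistent` accepts the trace A* actually produced and rejects `corrupted_trace`. That separation between a correct answer and a correct-looking sequence of steps is the distinction the summary above describes.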
