The Unreasonable Effectiveness of Reasonless Intermediate Tokens

https://arxiv.org/abs/2505.13775
4•YeGoblynQueenne•1y ago

Comments

tocs3•1y ago
I asked ChatGPT to restate this in layman's terms (posted below), and I am not too surprised at the answer.

"Lately, some AI models have shown impressive abilities to solve complex problems, and many people credit this to a method called Chain of Thought (CoT), where the model is trained to think through steps like a human might. In this paper, we take a closer look at that idea to see if it's really what's driving better performance.

We focus on the model’s step-by-step thinking (the words it generates along the way) — often treated like human "thoughts" — and examine whether these actually help the model solve problems more accurately. To test this, we train AI models using clean, correct step-by-step reasoning paths and final answers, all based on a known solving method (A* search). This lets us check both the final answers and the reasoning steps to see how they relate.

Interestingly, we find that even when a model gives the right answer, its reasoning steps can still be wrong or messy. To go further, we even train models using completely random and incorrect reasoning steps, and surprisingly, they still perform about as well as, and sometimes even better than, those trained on correct steps.

This suggests that the step-by-step "thoughts" the model shows aren’t as meaningful or reliable as many assume. In short, just because a model looks like it’s reasoning through a problem doesn’t mean it actually is — and we should be careful not to treat its outputs as if it thinks like a human or follows strict logic."
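
For readers unfamiliar with the setup described in the summary, here is a minimal sketch of what "step-by-step reasoning paths based on A* search" could look like. It runs A* on a tiny grid and logs each search step as a trace entry next to the final path. Everything here is an assumption made for illustration: the grid, the function name `astar_with_trace`, and the `("create" | "close", cell, g, h)` trace tuples are hypothetical and are not taken from the paper's code or data format.

```python
import heapq


def astar_with_trace(grid, start, goal):
    """Run A* on a 4-connected grid and return (path, trace).

    grid is a list of equal-length strings where '#' marks a wall.
    The trace format is invented for this sketch; it is NOT the
    format used in the paper.
    """

    def h(cell):
        # Manhattan distance: admissible and consistent for unit-cost moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    best_g = {start: 0}      # cheapest known cost to reach each cell
    parent = {start: None}   # back-pointers for path reconstruction
    frontier = [(h(start), 0, start)]
    closed = set()
    trace = [("create", start, 0, h(start))]

    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        trace.append(("close", cell, g, h(cell)))

        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1], trace

        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < len(grid) and 0 <= nc < len(grid[0])):
                continue
            if grid[nr][nc] == "#":
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float("inf")):
                best_g[nxt] = ng
                parent[nxt] = cell
                heapq.heappush(frontier, (ng + h(nxt), ng, nxt))
                trace.append(("create", nxt, ng, h(nxt)))

    return None, trace  # goal unreachable


# Hypothetical 3x3 maze: walls block the top two cells of the last column.
GRID = ["..#",
        "..#",
        "..."]
path, trace = astar_with_trace(GRID, (0, 0), (2, 2))
```

In this framing, a training example would pair the problem with `trace` (the intermediate tokens) and `path` (the final answer). The experiment described in the summary roughly corresponds to replacing `trace` with one that does not belong to the problem while keeping `path` correct.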

A native graphical shell for SSH

https://probablymarcus.com/blocks/2026/06/28/native-graphical-shell-for-SSH.html
1•mrcslws•37s ago•0 comments

Everything in Git? No Way

https://legoraft.com/posts/everything-git-no-way/
1•speckx•50s ago•0 comments

Edison Fears Hidden Perils of the X-Rays (1903)

https://web.archive.org/web/20071111054257/http://home.gwi.net/~dnb/read/edison/edison_xrays.htm
1•joebig•52s ago•0 comments

So You Want to Fix Your All Hands

https://randsinrepose.com/archives/so-you-want-to-fix-your-all-hands/
1•mooreds•1m ago•0 comments

Cowboys, Frontiersmen, Settlers, Townspeople, Cityfolk

https://huntersoftwareconsulting.com/posts/2026-06-28-company-phase-changes/
1•mooreds•2m ago•0 comments

WSL container is now available for public preview

https://devblogs.microsoft.com/commandline/wsl-container-is-now-available-for-public-preview/
1•soheilpro•3m ago•0 comments

Silicon Valley Gets High on Its Own Supply

https://monkeynoodle.org/2026/06/27/silicon-valley-gets-high-on-its-own-supply/
1•mooreds•3m ago•0 comments

A TacoSprint 2026 Retrospective

https://fzakaria.com/2026/06/29/a-tacosprint-2026-retrospective
1•setheron•4m ago•1 comments

Show HN: Cline subscription plan to access GLM-5.2 at 2-5x discount

https://cline.bot/cline-pass
2•sdrzn•4m ago•0 comments

Flipper Devices' new Busy Bar is a customizable display for productivity

https://techcrunch.com/2026/06/29/flipper-devices-new-busy-bar-is-a-customizable-display-for-prod...
1•CharlesW•5m ago•0 comments

How Small Postgres Metadata Tables Throttle Your Largest Queries

https://www.tigerdata.com/blog/small-postgres-metadata-tables
1•soheilpro•5m ago•0 comments

A reliable unprivileged container jail escape proof of concept for CentOS/RHEL

https://github.com/sgkdev/ipv6_frag_escape
1•eyberg•5m ago•0 comments

Show HN: PMB – local memory for coding agents that shows if it is used

https://pmbai.dev
1•oleksiibond•5m ago•0 comments

Cyberpunk Edgerunners 2 is releasing this year

https://www.kitguru.net/gaming/matthew-wilson/cyberpunk-edgerunners-2-is-releasing-this-year/
1•andsoitis•5m ago•0 comments

Ask HN: Homeless, Former Software Developer, What Now?

3•current_robot•6m ago•0 comments

Truckloads of Tesla Batteries Keep Getting Stolen Before They Leave the Factory

https://www.wired.com/story/truckloads-of-tesla-batteries-keep-getting-stolen-before-they-even-le...
1•WalterGR•7m ago•0 comments

Testing is f***ing awesome

https://mike.gg/testing-is-fucking-awesome
3•miketromba•7m ago•0 comments

Adobe to Acquire Topaz Labs

https://news.adobe.com/
1•razerbeans•9m ago•0 comments

Open Source Software Is the Pastime of the Rich

https://humancode.us/2024/09/18/open-source-pastime-for-rich
2•jllyhill•9m ago•1 comments

Sound Effects for Free Use from the Finnish Broadcasting Company, Yle Archives

https://freesound.org/people/YleArkisto/
2•DamonHD•9m ago•0 comments

Ask HN: How do you handle QA at a startup with no QA team? Genuinely curious

1•ovi_firstqa•9m ago•0 comments

Digital Sovereignty at the UN

https://www.zdnet.com/article/digital-sovereignty-un-global-push-to-replace-us-cloud-giants-with-...
2•CrankyBear•10m ago•0 comments

Loop Engineering Is Just Software Engineering

https://iii.dev/blog/loop-engineering-is-just-software-engineering/
2•appplemac•11m ago•0 comments

Subveris – A clean, free dashboard to track your subscriptions

https://www.subveris.com/
1•AlexiDonck•11m ago•1 comments

Microsoft Needs Windows Lite

https://philipbohun.com/blog/0011.html
1•pbohun•12m ago•0 comments

Leaked A20 Pro Image Hints at iPhone 18 Pro Performance Gains

https://www.macrumors.com/2026/06/29/leaked-a20-pro-image-iphone-18-pro-performance/
1•CharlesW•12m ago•0 comments

We took away psychological safety and then told everyone to be more productive

https://medium.com/design-bootcamp/we-took-away-psychological-safety-and-then-told-everyone-to-be...
3•speckx•12m ago•0 comments

AMD contributes their GPU support to tiny-vLLM

https://github.com/jmaczan/tiny-vllm/pull/2
2•yu3zhou4•12m ago•0 comments

Hotels cheat with AI-optimised images

https://hospitalityinside.com/en/Hotels-cheat-with-AI-optimised-images
1•bushwart•13m ago•2 comments

Accurate Decoding of Natural Sentences From Non-Invasive Brain Recordings

https://ai.meta.com/research/publications/accurate-decoding-of-natural-sentences-from-non-invasiv...
1•ilreb•14m ago•0 comments